This is the ninth video in Part 3 of the Performance-Aware Programming series. Please see the Table of Contents to quickly navigate through the rest of the course as it is updated weekly. The listings referenced in the video (listings 125, 126, 127, and 128) are available on GitHub.
By popular demand, in the past few videos, we took a more in-depth look at virtual memory than I was originally planning to include in this course. We don't have to worry that much about virtual memory for most of the things we do, so I hadn’t planned on making a whole section out of it. But, since people were curious, we ended up doing almost that.
That's fine! More knowledge is never going to hurt. It can only help us make good decisions. And as a result, we have a lot less to cover in today's post than we otherwise would have, because we already know about things like 2MB physical pages.
So let’s return to our haversine problem. Before we started our page table excursion, we were looking at our “first-read performance” — the amount of time it took us to read our data file off the disk (or from the OS memory cache) at startup.
When loading a file, you're often reading out of memory, not external storage, because some process recently read or wrote that same file. The OS caches file data, and when we hit that cache, we get speeds far beyond what the drive itself could deliver. So in general, unless we know for certain that we are not in this scenario, we shouldn't treat the I/O speed of a traditional hard drive, or even a solid-state drive, as our upper limit.
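To see the cache effect directly, here is a minimal sketch of the idea (my own, not one of the course listings): read the same file twice and time each pass. The first pass may have to go to the drive, but the second typically hits the OS cache. The file name and buffer size are placeholder assumptions.

```c
// Minimal sketch (not a course listing): time two consecutive reads of the
// same file. The second pass usually hits the OS file cache and runs much
// faster than the drive could deliver. "input.json" is a placeholder name.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

// Wall-clock seconds via C11 timespec_get
static double Seconds(void)
{
    struct timespec TS;
    timespec_get(&TS, TIME_UTC);
    return (double)TS.tv_sec + 1e-9*(double)TS.tv_nsec;
}

// Read the entire file in chunks and return the elapsed time
static double TimeWholeFileRead(char const *FileName, char *Buffer, size_t BufferSize)
{
    double Start = Seconds();
    FILE *File = fopen(FileName, "rb");
    if(File)
    {
        while(fread(Buffer, 1, BufferSize, File) > 0) {}
        fclose(File);
    }
    return Seconds() - Start;
}

int main(void)
{
    size_t BufferSize = 1024*1024;
    char *Buffer = (char *)malloc(BufferSize);
    if(Buffer)
    {
        // NOTE: the first pass may itself already be cached if the file was touched recently
        printf("First read:  %f seconds\n", TimeWholeFileRead("input.json", Buffer, BufferSize));
        printf("Second read: %f seconds\n", TimeWholeFileRead("input.json", Buffer, BufferSize));
        free(Buffer);
    }
    return 0;
}
```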
Now, when we benchmarked this to see what the maximum speed of a cached read was, we hit a snag: allocating the memory for the read actually made the read significantly slower. When we looked into why, it was clear that the culprit was the operating system periodically interrupting the process to map our virtual pages to physical pages.
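As a rough illustration of the kind of comparison involved (again, my own sketch under assumed names and sizes, not listings 125-128), we can time repeated reads into one reused buffer against reads into a freshly allocated buffer each pass. The fresh allocation forces the OS to map new physical pages as the read first touches them, and that mapping work shows up as a slower "read":

```c
// Rough sketch (not listings 125-128): compare cached-read speed when the
// destination buffer is reused vs. freshly allocated every pass. The fresh
// allocation re-pays the OS page-mapping cost on each pass. File name and
// buffer size are placeholders.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

static double Seconds(void)
{
    struct timespec TS;
    timespec_get(&TS, TIME_UTC);
    return (double)TS.tv_sec + 1e-9*(double)TS.tv_nsec;
}

static void ReadInto(char const *FileName, char *Buffer, size_t Size)
{
    FILE *File = fopen(FileName, "rb");
    if(File)
    {
        size_t Read = fread(Buffer, 1, Size, File);
        (void)Read;
        fclose(File);
    }
}

int main(void)
{
    char const *FileName = "input.json";  // placeholder
    size_t Size = 256*1024*1024;          // placeholder: at least the file size
    int PassCount = 8;

    // Reused buffer: pages get mapped on the first pass, then stay mapped.
    char *Reused = (char *)malloc(Size);
    if(!Reused) return 1;
    double Start = Seconds();
    for(int Pass = 0; Pass < PassCount; ++Pass)
    {
        ReadInto(FileName, Reused, Size);
    }
    printf("Reused buffer:    %f s/pass\n", (Seconds() - Start) / PassCount);
    free(Reused);

    // Fresh buffer each pass: large allocations typically come back unmapped,
    // so every pass re-incurs the page-mapping work during the read.
    Start = Seconds();
    for(int Pass = 0; Pass < PassCount; ++Pass)
    {
        char *Fresh = (char *)malloc(Size);
        if(!Fresh) return 1;
        ReadInto(FileName, Fresh, Size);
        free(Fresh);
    }
    printf("Fresh allocation: %f s/pass\n", (Seconds() - Start) / PassCount);

    return 0;
}
```

On a typical system, the reused-buffer loop should approach the cached-read ceiling, while the fresh-allocation loop lands noticeably below it, which is the gap we would like to close.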
So the natural question we want to ask is, can we do anything about this?