This is the fourteenth video in Part 1 of the Performance-Aware Programming series. Please see the Table of Contents to quickly navigate through the rest of the course as it is updated weekly. The companion listings (58, 59, 60, 61, 62, 63, and 64) are available on GitHub.
In the next two weeks, we will wrap up Part One with a code review and a final Q&A. But otherwise, this is the last actual lesson of Part One.
Part One was designed to make sure that, before we move forward to performance analysis, everyone is comfortable understanding what a CPU is doing at a low level. Throughout the rest of this course, we’ll constantly be looking at ASM listings to see why things behave the way they behave. Reading ASM is, in some sense, the foundational skill that all our other skills will build on. Whenever we’re not sure why we got a particular result, we will inspect the ASM to find out what’s really going on.
Even though we now have to step up to modern x64 chips, whose performance is more difficult to analyze, the ASM itself doesn’t get significantly more complicated. In fact, most of the instructions you will see apart from SIMD extensions are nearly identical to what you’ve already learned on the 8086. Most of the work you’ve done in this part carries forward without modification — despite forty years of continual evolution, the core instructions of the 8086 persist to this day!
So we will have to learn new instructions for SIMD. But we really won’t have to learn much of anything for the rest of the ASM we’re likely to see on x64.
And that’s exactly what we’ll see first-hand in this post. There are a few slight changes that we’d notice if we tried to read x64 ASM right now. I’d like to briefly go over what they are, and then show you some actual x64 assembly listings so you can see just how familiar it is already.