This is the sixth video in the Prologue of the Performance-Aware Programming series. It discusses one of five multipliers that cause programs to be slow. Please see the Table of Contents to quickly navigate through the rest of the course as it is updated weekly. A lightly-edited transcript of the video appears below.
In all of the previous videos, we referred to the thing that was running our program as a “core” or a “CPU”, but we never really talked about what that means, or why there were apparently two terms for the same thing. Today we’re going to look more closely at the concept of “cores” and how they facilitate the final multiplier that causes programs to go much slower than they should: multithreading.
Multithreading for performance is just the simple idea that if one computer can do something at a certain speed, two computers should be able to do it at a faster speed. Ideally, if we have two computers instead of one, it gets twice as fast. Could it get more than twice as fast? It seems unlikely at first glance, but we’ll look more closely at that later.
In the consumer space, this idea started off as simple as “two computers are faster than one”. If we put two separate physical CPU packages into a machine, then we can make a single physical computer that gains some or all of the performance that would normally require two physical computers. And it also becomes easier for the CPUs to communicate with each other, since they’re now connected directly rather than by a network cable or something similar.
Today, since chip fabrication technology has gotten more and more advanced, CPU designers are able to pack dramatically more things on a chip than they used to. A single modern CPU package may contain multiple connected chips, and those chips might each contain multiple cores, each the equivalent of an entire CPU from early generations. The result is that it is highly unlikely that any modern CPU has fewer than four complete cores on it, and many have more.
So technology has advanced to the point where now any standard consumer computer is really several computers. Each of the many cores in a modern computer is capable of running its own instruction stream independently. If you write software that can’t take advantage of the increased performance offered by these multiple instruction streams, you give up a large performance multiplier.
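To make “multiple instruction streams” a little more concrete, here is a minimal sketch of how a simple summation might be divided into one contiguous chunk per worker. It is not the code used in the video; the names `split_range`, `sum_chunk`, and `threaded_sum` are hypothetical, and the default worker count is just whatever `os.cpu_count()` reports, which is a count of logical CPUs and may not equal the number of physical cores.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def split_range(count, workers):
    """Divide the indices 0..count into `workers` contiguous (start, end) chunks."""
    base, extra = divmod(count, workers)
    chunks = []
    start = 0
    for i in range(workers):
        # The first `extra` chunks take one additional element each.
        end = start + base + (1 if i < extra else 0)
        chunks.append((start, end))
        start = end
    return chunks


def sum_chunk(values, start, end):
    """Sum values[start:end]; each worker runs this loop independently."""
    total = 0
    for i in range(start, end):
        total += values[i]
    return total


def threaded_sum(values, workers=None):
    """Sum `values` by handing each worker thread one contiguous chunk."""
    if workers is None:
        # Assumption: one worker per logical CPU reported by the OS.
        workers = os.cpu_count() or 1
    chunks = split_range(len(values), workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(lambda c: sum_chunk(values, c[0], c[1]), chunks)
        # Combine the per-chunk partial sums into the final answer.
        return sum(partials)


if __name__ == "__main__":
    data = list(range(1_000_000))
    # Check the chunked result against Python's built-in sum.
    print(threaded_sum(data, workers=4) == sum(data))
```

One caveat: in standard CPython builds, the global interpreter lock keeps these threads from executing Python bytecode at the same time, so the chunks would only truly run on separate cores in a compiled language or in separate processes. The sketch illustrates how the work divides, not the size of the multiplier.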
How large? That’s what we’re going to look at now.