Microsoft Intern Interview Question #1: Rectangle Copy
The first question was presumably designed to be the easiest. It was much easier than #3 and #4, but arguably harder than #2.
There are no Performance-Aware Programming lessons this week because it is Summer Internship 1994 Week here on Computer, Enhance! This week will feature five posts about the programming questions I was asked when I interviewed for a Microsoft summer internship way, way back in 1994. Normally scheduled lessons will resume next week.
For my Microsoft Summer internship interview in 1994, I was asked four programming questions, one by each of the four sequential interviewers. The questions were roughly ordered by difficulty, although arguably #2 is the easiest.
Question #1 was to write the C code that would copy a rectangle from one buffer to another buffer. I don’t remember the exact prototype they wanted, but it was something like this:
void CopyRect(char *BufferA, int PitchA, char *BufferB, int PitchB,
int FromMinX, int FromMinY, int FromMaxX, int FromMaxY,
int ToMinX, int ToMinY)
{
/* Fill this in */
}
A rectangle copy is a slightly more complicated version of a basic memory copy. Normally, when you write code that copies memory, you are copying one contiguous memory chunk to another — for example, you might copy 64 bytes from one memory address range to some other memory address range. In C this will typically look like a single for loop, or the equivalent.
Rectangle copies are used when the program is pretending that some chunk of memory — which is inherently one-dimensional — is actually two-dimensional. Even though memory itself is always addressed contiguously with one index, we of course pretend that we have two-dimensional storage by being clever with how we compute our memory offsets. We create a mapping from, say, an X and Y coordinate, which is two indices, into a single (one-dimensional) memory offset.
This is obviously a very common thing to do. It’s how computers are able to work with 2D things, like images or spreadsheets.
Typically, when working with 2D buffers like this, we use the term pitch to refer to the distance between an element on one row and the element in that same column on the next row. In a top-down image, for example, if we have a pointer to a pixel, adding the pitch to that pointer would give us the pixel exactly below it.
I of course can’t remember exactly what I started writing on the whiteboard to solve this problem. It’s nearly 30 years ago at this point! But my suspicion is that I started by computing the locations of the first element in both the source and the destination. To do this, you need to do some arithmetic using the minimum x and y coordinates of the two rectangles, and the base pointer and pitch of each buffer:
char *Source = BufferA + FromMinY*PitchA + FromMinX;
char *Dest = BufferB + ToMinY*PitchB + ToMinX;
We have to multiply the starting Y coordinates by the pitches, because we need to move down that many rows from the buffer base pointers. We don’t have to multiply the X coordinates by anything, since the elements are single bytes. If the elements were multiple bytes, we’d have to multiply by that byte count.
For this rectangular copy question, we are being asked to do a copy from one rectangle to another. This means that we are copying partial rows of our source buffer into partial rows of our destination buffer. This means we need two for loops, because the memory we are copying is not entirely contiguous.
The outer loop will have to loop over the number of rows that we have to copy — the height of the rectangle. The inner loop will have to loop over the number of columns, which will be a contiguous copy for each row.
Again, I’m not sure exactly what I wrote, but it was probably something like this:
int Width = FromMaxX - FromMinX;
int Height = FromMaxY - FromMinY;
for(int Y = 0; Y < Height; Y++)
{
for(int X = 0; X < Width; X++)
{
This is a very simple loop construction that just iterates over the width and height of the rectangle we are copying.
Inside the inner loop, we need to actually copy the memory. To do this, we just index our source and destination pointers with our X iterator:
Dest[X] = Source[X];
}
Finally, before we close the Y loop, we need to move our pointers so they point to the next row. We already know how to do this, since we did it in the initial setup — the pitch tells us how far to move to advance one row:
Source += PitchA;
Dest += PitchB;
}
Putting all of this together, we get the final “answer”, which is as close as I can remember to what I wrote at the time:
void CopyRect(char *BufferA, int PitchA, char *BufferB, int PitchB,
int FromMinX, int FromMinY, int FromMaxX, int FromMaxY,
int ToMinX, int ToMinY)
{
char *Source = BufferA + FromMinY*PitchA + FromMinX;
char *Dest = BufferB + ToMinY*PitchB + ToMinX;
int Width = FromMaxX - FromMinX;
int Height = FromMaxY - FromMinY;
for(int Y = 0; Y < Height; Y++)
{
for(int X = 0; X < Width; X++)
{
Dest[X] = Source[X];
}
Source += PitchA;
Dest += PitchB;
}
}
This is a pretty standard rectangle copy. Then, as now, writing this off the top of my head, I may have made mistakes. The interviewer didn’t think I made any — but of course, he may not have caught them either! But, as with any whiteboard sketch, you would not consider it finished code. You would obviously want to try it out and see if actually works, but that was not part of the interview.
Since the interview question did not ask for any particular performance considerations, that was basically it. The interviewer was satisfied I knew how to do this sort of thing in C, and we moved on.
Of course, if we did want to think about 386-era performance considerations, the answer could get a lot more complicated. I would not have known these things at the time, but you would have to think about a lot of other issues with how x86 processors worked. You’d need to look at loop unrolling, copying more than a byte at a time, changing from a pitch to a stride, etc.
And really, given the limited optimization capability of compilers at the time, if you were actually going to be copying rectangles around and cared if it was fast, you’re probably going to write this thing in assembly. There’s almost no chance you could have gotten a C compiler to output the fastest possible rectangle copy just by adjusting the C code in those days.
However, even that would have been relatively “easy” to program. The fastest possible CPU rectangle copy in the 8086-80386 days was still a tractable amount of ASM. You wouldn’t want to have to write it out on a whiteboard on the spot, but it wouldn’t be that much work to figure out.
In the modern era, things are unfortunately not so simple. If you just want to copy a rectangle of memory once or twice on the CPU, well, you could use the exact same code and turn on auto-vectorization. You’ll probably get something usable.
But if you actually need to do a lot of fast rectangle copies, it requires a lot more work. You might want to move the entire thing to the GPU, since it excels at things like rectangle copies. But if you stay on the CPU, you’re going to have to create a similar kind of structure to the graphics pipeline of a GPU: break your buffers into tiles, figure out which rectangles overlap which destination tiles, have each CPU core operate on a different tile, etc. You are basically making a pared-down GPU rasterizer if you want fast bulk rectangle copying on the CPU.
There’s unfortunately a lot more to learn if you want to understand how to take advantage of modern hardware. However, that’s exactly what we do here at Computer Enhance, and the entire reason for the Performance-Aware Programming Series. So I can’t complain!
That’s about all there is to say about question #1. Since the interviewer didn’t ask for any performance considerations, it’s a pretty easy question. Question #2 was also easy, but, the interviewer ending up making it more difficult… for himself. At the time, he probably thought he was making it more difficult for me… but as we’ll see tomorrow, it turns out he may not really have understood the subtleties of his own question!
If you’re not already a Computer Enhance subscriber and would like to sign up to have our free or paid posts delivered to your inbox, you can always pick a subscription level here:
Just wanted to point out, as is typical with whiteboard questions, there's a small indexing error that's hard to spot unless you know a few easy edge cases to check. FromMinY and FromMaxY would be the same and you would skip the outer for loop entirely even if FromMinX and FromMaxX were different.
It's the classic: if you have books 2 through 5 of a series you have 4 books problem.
Love the content though, it's really interesting to see what interview questions back in 1994 looked like.
1) Wouldn't calling memcpy instead of having an inner loop be more concise?
2) Let's say your buffer width was something like 2160 and your rectangle had a width of 2000. Is there any performance benefit potential from using something like memcpy?
3) In an interview, it feels like they often want you to think about the edge cases. In this case, I would point out the possibility that the rect from the source buffer might be too large for the destination buffer and position. This solution wouldn't handle that intuitively.