Rendering 3d From Scratch Chapter 4 - The Screen

Last time we talked about projecting the faces of an object onto a camera plane. We wrote some code that will take a face in 3d space, and project it onto a plane. There’s a big problem, though. All the points are still in 3d space! Sure, they’re all on the same 2d plane, and that’s progress, but our ultimate goal is to draw some pixels on a screen. For that we need some nice (x,y) points.

Let’s figure it out. Imagine our camera plane from before with some points on it from one of our shapes:

As a thought exercise, let’s pretend we know the coordinates of all these points, but they’re all in 3d. Is there a way we can figure out a “normalized” coordinate for each point? By normalized here, I mean a 2d point somewhere between (0, 0) and (1, 1), where 0 in the x would be the left side and 1 would be the right, and 0 in the y would be the bottom and 1 would be the top.

Last time we learned about the dot product, and how it equals the cosine of the angle between two vectors multiplied by the magnitudes of each vector. The formula looks like:

$$a \cdot b=||a||\space||b||\space\cos(\theta)$$
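If you'd like to convince yourself of this numerically, here is a tiny Python sketch. It is only a sanity check, and the helper names `dot` and `magnitude` are made up for illustration. It computes the dot product component by component and compares it with the magnitudes-times-cosine form for two vectors that are 45 degrees apart.

```python
import math

def dot(a, b):
    """Dot product: sum of the component-wise products."""
    return sum(x * y for x, y in zip(a, b))

def magnitude(v):
    """Length of a vector, which is the square root of its dot product with itself."""
    return math.sqrt(dot(v, v))

a = (1.0, 0.0, 0.0)
b = (1.0, 1.0, 0.0)  # sits 45 degrees away from a in the xy plane

lhs = dot(a, b)
rhs = magnitude(a) * magnitude(b) * math.cos(math.radians(45))

# The two sides agree up to floating point rounding.
print(math.isclose(lhs, rhs))
```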

Can we use this? Let’s draw a triangle (99% of problems can be solved by finding a right triangle):

A right triangle with one of the points from our shape

We would like to find the magnitude of the line a (which is our x coordinate) and b (which is our y coordinate). Let’s find a first, because you can use the same approach to find b.

From SOHCAHTOA we know

$$\cos(\theta)={||a||\over{||c||}}$$

So

$$||a||=||c||\space\cos(\theta)$$

And using our dot product from last time, we can take the dot product of the vector c, which is known, and the unit vector of a (which is easy to calculate if we have our four corners):

$$\hat{a} \cdot c=||\hat{a}||\space||c||\space\cos(\theta)$$

The magnitude of the unit vector cancels out, because the magnitude of a unit vector is 1.

$$\hat{a} \cdot c=||c||\space\cos(\theta)$$

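There you have it. The right-hand side is exactly the magnitude of a that we were looking for, so our x coordinate is just the dot product of the unit vector of a with c. That is actually pretty simple, and once again is a testament to the power of the dot product! We know now that if we can figure out the coordinates of the corners of our camera plane, we can determine actual 2d pixel coordinates for any 3d point on that plane.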
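To see the trick with actual numbers, here is a minimal Python sketch. It assumes a hypothetical camera plane lying in the z = 0 plane, 4 units wide and 2 units tall, and it measures c from the bottom-left corner. The corner values and the helper name `plane_coordinate` are illustrative assumptions, not the renderer's real code.

```python
import math

def sub(p, q):
    """Vector from q to p."""
    return tuple(pi - qi for pi, qi in zip(p, q))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def magnitude(v):
    return math.sqrt(dot(v, v))

def normalize(v):
    length = magnitude(v)
    return tuple(x / length for x in v)

def plane_coordinate(point, origin, edge_end):
    """Distance of `point` from `origin`, measured along the edge origin -> edge_end."""
    c = sub(point, origin)                     # the hypotenuse of our triangle
    a_hat = normalize(sub(edge_end, origin))   # unit vector along the edge
    return dot(a_hat, c)                       # ||c|| cos(theta)

# Hypothetical corners of the camera plane (assumed values).
bottom_left = (-2.0, -1.0, 0.0)
bottom_right = (2.0, -1.0, 0.0)
top_left = (-2.0, 1.0, 0.0)

point = (1.0, 0.0, 0.0)  # a point that already lies on the plane

x = plane_coordinate(point, bottom_left, bottom_right)  # 3.0
y = plane_coordinate(point, bottom_left, top_left)      # 1.0

# Divide by the edge lengths to land in the 0..1 range.
u = x / magnitude(sub(bottom_right, bottom_left))  # 0.75
v = y / magnitude(sub(top_left, bottom_left))      # 0.5
print(u, v)
```

Dividing by the length of each edge is what turns the raw distance into a normalized coordinate between 0 and 1.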

Here we can employ another incredibly valuable tool from linear algebra. The cross product! The cross product of two vectors will give us a vector that is perpendicular to both vectors.

Cross product of two vectors gives a third vector which is orthogonal to both vectors

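So, we have our plane normal. It points away from our plane. If we take the cross product of that normal and a vector heading straight up (which is just (0, 1, 0)), we will get a vector which points sideways along our plane. (This works as long as the normal itself isn’t pointing straight up or down, in which case the cross product is the zero vector.)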

We can normalize this vector to unit length and multiply it by the size of our camera to get the extents of our camera’s view. Then, to get the top and bottom of the camera, we take the cross product of this sideways vector with the plane’s normal. This will give us a vector heading “up” along the plane (“up” here is in relation to the plane).

Once again, we can multiply this by the size to get the top and bottom.
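Here is a rough Python sketch of those two cross products, just to check the directions. The names `camera_axes`, `WORLD_UP` and `half_size` are hypothetical, and treating the camera "size" as a distance from the center of the plane to its edge is an assumption on my part. Whether the sideways vector ends up pointing left or right depends on the handedness of your coordinate system.

```python
import math

WORLD_UP = (0.0, 1.0, 0.0)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product: a vector perpendicular to both a and b."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)

def normalize(v):
    length = math.sqrt(dot(v, v))
    if length == 0.0:
        raise ValueError("cannot normalize a zero-length vector")
    return tuple(x / length for x in v)

def scale(v, s):
    return tuple(x * s for x in v)

def camera_axes(normal):
    """Return unit (sideways, up) vectors lying in the plane with this normal."""
    n = normalize(normal)
    # Raises ValueError if the normal is parallel to WORLD_UP (camera looking
    # straight up or down), because the cross product is then zero.
    sideways = normalize(cross(n, WORLD_UP))
    # sideways and n are unit length and perpendicular, so this is unit length too.
    up = cross(sideways, n)
    return sideways, up

sideways, up = camera_axes((0.0, 0.0, 1.0))

half_size = 2.0                             # assumed: center-to-edge distance
side_offset = scale(sideways, half_size)    # center -> one side edge
top_offset = scale(up, half_size)           # center -> top edge
print(sideways, up)
```

Negating `side_offset` and `top_offset` gives the opposite edges, so these two vectors plus the center of the plane are enough to locate all four corners.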

Once we have our left, right, top and bottom, we can use our dot product trick from above to convert all of our plane coordinates into normalized screen coordinates. Then, we can multiply these by the width and height of whatever screen or image we are rendering to in order to get actual pixel coordinates.
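That last step is small enough to sketch right away. This is a minimal Python illustration, not the final renderer code: the function name `to_pixel` and the `flip_y` option are assumptions I'm adding here. The flip is there because many image formats put row 0 at the top, while our normalized y has 0 at the bottom.

```python
def to_pixel(u, v, width, height, flip_y=False):
    """Scale a normalized (u, v) in the 0..1 range up to pixel coordinates.

    flip_y is a hypothetical convenience for targets whose y axis grows
    downward (row 0 at the top of the image).
    """
    if flip_y:
        v = 1.0 - v
    return (u * width, v * height)

# A point three quarters of the way across and halfway up an 800x600 image.
print(to_pixel(0.75, 0.5, 800, 600))  # (600.0, 300.0)
```

The result is left as floats on purpose. How you round and clamp to whole pixels depends on the drawing code that consumes these values.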

At this point, we have a conceptual idea of how to convert the faces of an object in the 3d world to points on a 2d screen, but we haven’t written the code to do so. Next time, we’ll cover that code in detail!