In computer graphics, the perspective matrix is a key tool that transforms the view frustum into a rectangular cuboid, like so.

(Diagram redrawn after ©️ 📖Fundamentals of Computer Graphics)

Note

This post focuses on the perspective matrix. For a broader understanding of viewing, camera, projection, and viewport transformations, check out 📺Viewing Transformation: from Camera to Screen.

📝Definition

The perspective matrix $P$ is defined as:

P = \begin{bmatrix} n & 0 & 0 & 0\\ 0 & n & 0 & 0\\ 0 & 0 & n+f & -fn\\ 0 & 0 & 1 & 0\\ \end{bmatrix},

where

$n$ is the distance to the near plane, and
$f$ is the distance to the far plane.
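To make the definition concrete, here is a minimal sketch in plain Python that builds $P$ and applies it to one point. The helper names (`perspective_matrix`, `apply`, `to_cartesian`) and the sample values $n = 1$, $f = 3$ are assumptions chosen for illustration, and the sketch follows this post's convention that the near plane sits at $z = n$.

```python
def perspective_matrix(n, f):
    """Build the 4x4 matrix P from the definition (nested row lists)."""
    return [
        [n, 0, 0, 0],
        [0, n, 0, 0],
        [0, 0, n + f, -f * n],
        [0, 0, 1, 0],
    ]


def apply(matrix, point):
    """Multiply a 4x4 matrix by a 4-vector in homogeneous coordinates."""
    return [sum(m * p for m, p in zip(row, point)) for row in matrix]


def to_cartesian(h):
    """Divide by the 4th component w to recover the 3D point."""
    x, y, z, w = h
    return (x / w, y / w, z / w)


P = perspective_matrix(n=1.0, f=3.0)
# The point (2, 4, 2) lies between the near plane (z = 1) and the far plane (z = 3).
squished = to_cartesian(apply(P, [2.0, 4.0, 2.0, 1.0]))
print(squished)  # (1.0, 2.0, 2.5)
```

The $x$ and $y$ values are halved because the point is twice as deep as the near plane.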

🗣Step by Step Explanation

Let’s start from scratch, assuming we don’t know the matrix yet. All we know is

the matrix is 4x4
the input is a point, therefore a 4-vector in homogeneous coordinates
the output is also a point, 4-vector

\begin{bmatrix} a_1 & a_2 & a_3 & a_4 \\ b_1 & b_2 & b_3 & b_4 \\ c_1 & c_2 & c_3 & c_4 \\ d_1 & d_2 & d_3 & d_4 \end{bmatrix} \begin{bmatrix} x\\y\\z\\1 \end{bmatrix} = \begin{bmatrix}x'\\y'\\z'\\1\end{bmatrix}

Our goal is to find the relationships between the input and output to determine the perspective matrix.

🔍Observation 1️⃣: Proportional

We start with a key observation:

Lemma 1

The perspective matrix leaves points on the near plane ($z = n$) unchanged but squishes every other point so that it shares the same $x$ and $y$ as its “complement” on the near plane, i.e. the point where the line from the origin through it crosses $z = n$.

After squishing:

🟢 shares the same $x$ and $y$ as 🟡
🔵 shares the same $x$ and $y$ as 🔴

We can now start to establish the relationship. For every point after squishing, there is a point on the near plane with the same value $y'$. At the same time, this 🟤 point is tied to the point before squishing.

Lemma 2

==The values $y'$ and $y$ are in the same ratio as the depths $n$ and $z$==! (similar triangles)

See the diagram below.

Hence, for arbitrary $x$ and $y$ , we can derive:

\begin{align} \frac{y'}{n} &= \frac{y}{z} \implies y' = \frac{ny}{z} \\ \frac{x'}{n} &= \frac{x}{z} \implies x' = \frac{nx}{z} \end{align}
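The two formulas above can be tried out directly. This is a small sketch, with the function name `squish_xy` and the sample numbers being assumptions for illustration only.

```python
def squish_xy(x, y, z, n):
    """Return (x', y') from the similar-triangle relations x' = n*x/z, y' = n*y/z."""
    if z == 0:
        raise ValueError("z must be nonzero")
    return (n * x / z, n * y / z)


# A point on the near plane (z == n) keeps its x and y.
on_near = squish_xy(3.0, 6.0, 1.0, n=1.0)
# The same x and y at three times the depth shrink to one third.
deeper = squish_xy(3.0, 6.0, 3.0, n=1.0)
print(on_near, deeper)  # (3.0, 6.0) (1.0, 2.0)
```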

Thus, the matrix multiplication can be expressed as:

\begin{align} \begin{bmatrix} a_1 & a_2 & a_3 & a_4 \\ b_1 & b_2 & b_3 & b_4 \\ c_1 & c_2 & c_3 & c_4 \\ d_1 & d_2 & d_3 & d_4 \end{bmatrix} \begin{bmatrix} x\\y\\z\\1 \end{bmatrix} &= \begin{bmatrix}x'\\y'\\z'\\1\end{bmatrix} \\ &=\begin{bmatrix}\frac{nx}{z}\\\frac{ny}{z}\\?\\1\end{bmatrix} \end{align}

Remark

Note that

the 3rd value $z'$ is still unknown.

the 4th value is $1$ as the output is still a point.

According to the properties in homogeneous coordinates, we have:

Lemma 3

In homogeneous coordinates, multiplying a point by a nonzero scalar $k$ gives the same point. That is, $\mathbf{p}_1 = (x, y, z, w)^T$ and $\mathbf{p}_2 = k(x, y, z, w)^T$ are considered equivalent.

We can then multiply the output vector $\displaystyle \begin{bmatrix}\frac{nx}{z}\\\frac{ny}{z}\\?\\1\end{bmatrix}$ by $z$:

\begin{bmatrix}\frac{nx}{z}\\\frac{ny}{z}\\?\\1\end{bmatrix} = \begin{bmatrix}{nx}\\ny\\?\\z\end{bmatrix} \qquad (\text{the same point})

Both vectors represent the same point, because they differ only by the nonzero factor $z$.
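One way to see this equivalence in code is to divide each vector by its own 4th component and compare the results. The sketch below uses exact fractions and integer inputs; `normalize` and `same_point` are hypothetical helper names, not part of any library.

```python
from fractions import Fraction


def normalize(h):
    """Divide a homogeneous 4-vector (integer entries) by its w component."""
    x, y, z, w = h
    if w == 0:
        raise ValueError("w must be nonzero for a point")
    return (Fraction(x, w), Fraction(y, w), Fraction(z, w))


def same_point(p, q):
    """Two homogeneous vectors name the same point if they agree after dividing by w."""
    return normalize(p) == normalize(q)


p = (2, 4, 6, 1)
q = tuple(3 * c for c in p)  # every component scaled by k = 3
r = (2, 4, 6, 2)             # only w changed, so this is a different point

print(same_point(p, q))  # True
print(same_point(p, r))  # False
```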

Now, the matrix multiplication becomes:

\begin{align} \begin{bmatrix} a_1 & a_2 & a_3 & a_4 \\ b_1 & b_2 & b_3 & b_4 \\ c_1 & c_2 & c_3 & c_4 \\ d_1 & d_2 & d_3 & d_4 \end{bmatrix} \begin{bmatrix} x\\y\\z\\1 \end{bmatrix} &=\begin{bmatrix}\frac{nx}{z}\\\frac{ny}{z}\\?\\1\end{bmatrix} \\ &=\begin{bmatrix}{nx}\\ny\\?\\z\end{bmatrix} \\ x \begin{bmatrix} a_1 \\ b_1 \\ c_1 \\ d_1 \end{bmatrix} + y \begin{bmatrix} a_2 \\ b_2 \\ c_2 \\ d_2 \end{bmatrix} + z \begin{bmatrix} a_3 \\ b_3 \\ c_3 \\ d_3 \end{bmatrix} + 1 \begin{bmatrix} a_4 \\ b_4 \\ c_4 \\ d_4 \end{bmatrix} &=\begin{bmatrix}{nx}\\ny\\?\\z\end{bmatrix} \end{align}

By examining each row:

✅1st row: $a_1 x + a_2 y + a_3 z + a_4 = nx$. This must hold for every $x$, $y$, and $z$, so matching coefficients gives $a_1 = n$ and all the others $0$.
✅2nd row: $b_1 x + b_2 y + b_3 z + b_4 = ny$. By the same argument, $b_2 = n$ and all the others are $0$.
❌3rd row: we don’t have enough information to determine this row yet.
✅4th row: $d_1 x + d_2 y + d_3 z + d_4 = z$. Matching coefficients again gives $d_3 = 1$ and all the others $0$.

Wonderful! So far, the matrix looks like this:

\begin{bmatrix} n & 0 & 0 & 0\\ 0 & n & 0 & 0\\ c_1 & c_2 & c_3 & c_4\\ 0 & 0 & 1 & 0\\ \end{bmatrix} \begin{bmatrix} x\\y\\z\\1 \end{bmatrix} = \begin{bmatrix}{nx}\\ny\\?\\z\end{bmatrix}

🔍Observation 2️⃣: Points Don’t Move

From the last observation, only the 3rd row is still unknown. This row determines the $z$ value of a point. Recall from Lemma 1 that two kinds of special points don’t move under the transformation:

🟡 points on the near plane: after the squishing, these points stay at their original positions.
🔴 the center point of the far plane: the other points on this plane get squished towards the center, while this point alone stays at its original position.

We can express this observation algebraically.

For any 🟡 point on the near plane, which doesn’t change, we have

$\displaystyle \begin{align}\begin{bmatrix}n & 0 & 0 & 0\\0 & n & 0 & 0\\c_1&c_2&c_3&c_4\\0 & 0 & 1 & 0\\\end{bmatrix}\begin{bmatrix}x\\y\\\textcolor{orange}{n}\\1\end{bmatrix}&=\begin{bmatrix}x\\y\\\textcolor{orange}{n}\\1\end{bmatrix}\end{align}$
Writing the right-hand side in homogeneous coordinates (multiplying by $n$), we then have
$\displaystyle \begin{align}\begin{bmatrix}n & 0 & 0 & 0\\0 & n & 0 & 0\\c_1&c_2&c_3&c_4\\0 & 0 & 1 & 0\\\end{bmatrix}\begin{bmatrix}x\\y\\\textcolor{orange}{n}\\1\end{bmatrix}&=\begin{bmatrix}nx\\ny\\\textcolor{orange}{n^2}\\n\end{bmatrix}\end{align}$
Looking at the 3rd row, we get
$\displaystyle c_1x + c_2y + c_3n + c_4= n^2$

For the 🔴 center point of the far plane, which doesn’t change, we have

$\displaystyle \begin{align}\begin{bmatrix}n & 0 & 0 & 0\\0 & n & 0 & 0\\c_1 & c_2 & c_3 & c_4\\0 & 0 & 1 & 0\\\end{bmatrix}\begin{bmatrix}0\\0\\\textcolor{orange}{f}\\1\end{bmatrix}&=\begin{bmatrix}0\\0\\\textcolor{orange}{f}\\1\end{bmatrix}\end{align}$
Writing the right-hand side in homogeneous coordinates (multiplying by $f$), we then have
$\displaystyle \begin{align}\begin{bmatrix}n & 0 & 0 & 0\\0 & n & 0 & 0\\ c_1 & c_2 & c_3 & c_4\\0 & 0 & 1 & 0\\\end{bmatrix}\begin{bmatrix} 0\\0\\\textcolor{orange}{f}\\1\end{bmatrix}&=\begin{bmatrix} 0\\0\\\textcolor{orange}{f^2}\\f\end{bmatrix}\end{align}$
Looking at the 3rd row, where $x = y = 0$ for this point, we get
$\displaystyle c_1x + c_2y + c_3f + c_4= f^2$

Now let’s put them together

\begin{align} c_1x + c_2y + c_3n + c_4 &= n^2 \\ c_1x + c_2y + c_3f + c_4 &= f^2 \\ \end{align}

The first equation must hold for every $x$ and $y$ on the near plane, yet its right-hand side $n^2$ depends on neither, so $c_1$ and $c_2$ must be $0$. This simplifies the equations to:

\begin{align} c_3n + c_4 &= n^2 \\ c_3f + c_4 &= f^2 \\ \end{align}

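Boom🤯! This is a system of 2 equations in 2 unknowns. Subtracting the first from the second gives $c_3(f - n) = f^2 - n^2$, so $c_3 = n + f$ (since $f \neq n$), and substituting back gives $c_4 = n^2 - (n+f)n = -fn$. That is,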

\begin{align} c_3 &= n+f \\ c_4 &= -fn. \\ \end{align}
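As a quick numeric check of this solution, the sketch below solves the same two equations by elimination using exact fractions. The function name `solve_third_row` and the sample values $n = 2$, $f = 5$ are assumptions for illustration.

```python
from fractions import Fraction


def solve_third_row(n, f):
    """Solve c3*n + c4 = n^2 and c3*f + c4 = f^2 for (c3, c4)."""
    if n == f:
        raise ValueError("near and far planes must differ")
    n, f = Fraction(n), Fraction(f)
    c3 = (f * f - n * n) / (f - n)  # subtract the first equation from the second
    c4 = n * n - c3 * n             # substitute c3 back into the first equation
    return c3, c4


c3, c4 = solve_third_row(2, 5)
print(c3, c4)  # 7 -10, i.e. n + f and -f*n
```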

Finally, we have solved for the perspective matrix:

P = \begin{bmatrix} n & 0 & 0 & 0\\ 0 & n & 0 & 0\\ 0 & 0 & n+f & -fn\\ 0 & 0 & 1 & 0\\ \end{bmatrix}.
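The two observations can be checked against the finished matrix. The following is a minimal sketch with exact fractions and integer inputs; `perspective_matrix`, `transform`, and the sample values $n = 2$, $f = 6$ are assumptions chosen for illustration.

```python
from fractions import Fraction


def perspective_matrix(n, f):
    """The matrix P derived above, as nested row lists."""
    return [
        [n, 0, 0, 0],
        [0, n, 0, 0],
        [0, 0, n + f, -f * n],
        [0, 0, 1, 0],
    ]


def transform(matrix, point3):
    """Apply a 4x4 matrix to the 3D point (x, y, z), then divide by w."""
    h = (*point3, 1)
    x, y, z, w = (sum(m * c for m, c in zip(row, h)) for row in matrix)
    return (Fraction(x, w), Fraction(y, w), Fraction(z, w))


n, f = 2, 6
P = perspective_matrix(n, f)

near_point = transform(P, (3, -1, n))  # a point on the near plane
far_center = transform(P, (0, 0, f))   # the center of the far plane
far_corner = transform(P, (9, 3, f))   # another point on the far plane

print(near_point == (3, -1, 2))  # True: unchanged
print(far_center == (0, 0, 6))   # True: unchanged
print(far_corner == (3, 1, 6))   # True: x and y scaled by n/z = 1/3
```

Points strictly between the two planes do change depth under this matrix, since $z' = (n + f) - \frac{fn}{z}$; for example, with the values above, $z = 3$ maps to $z' = 4$.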

👀See Also

For more on transformations, check out these videos:
