Xingxin on Bug

How to Derive the Perspective Matrix with 2 Observations?

January 2, 2025
7 min read

In computer graphics, the perspective matrix is a key tool: it transforms the view frustum into a rectangular cuboid, like so.

(Diagram redrawn after ©️ 📖Fundamentals of Computer Graphics)

Note

This post focuses on the perspective matrix. For a broader understanding of viewing, camera, projection, and viewport transformations, check out 📺Viewing Transformation: from Camera to Screen.

Definition

The perspective matrix $P$ is defined as:

$$
P = \begin{bmatrix} n & 0 & 0 & 0\\ 0 & n & 0 & 0\\ 0 & 0 & n+f & -fn\\ 0 & 0 & 1 & 0\\ \end{bmatrix},
$$

where

  • $n$ is the distance to the near plane, and
  • $f$ is the distance to the far plane.
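As a minimal sketch of how this matrix is used, the pure-Python snippet below builds $P$ for a given $n$ and $f$, multiplies it with a point in homogeneous coordinates, and divides by the 4th component. The helper names `perspective_matrix` and `apply` and the sample values are illustrative assumptions, not part of the original post.

```python
def perspective_matrix(n, f):
    """Return the 4x4 matrix P from the definition as nested lists."""
    return [
        [n, 0, 0, 0],
        [0, n, 0, 0],
        [0, 0, n + f, -f * n],
        [0, 0, 1, 0],
    ]


def apply(matrix, point):
    """Apply a 4x4 matrix to a 3D point and return the resulting 3D point."""
    x, y, z = point
    vec = [x, y, z, 1]  # homogeneous coordinates of the input point
    out = [sum(m * v for m, v in zip(row, vec)) for row in matrix]
    w = out[3]  # equals z for this matrix, so the point must have z != 0
    return tuple(c / w for c in out[:3])


P = perspective_matrix(n=1, f=3)
result = apply(P, (2, 4, 2))
print(result)  # (1.0, 2.0, 2.5)
```

The point sits at twice the near-plane distance, so its $x$ and $y$ are halved.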

🗣Step by Step Explanation

Let’s start from scratch, assuming we don’t know the matrix yet. All we know is:

  1. the matrix is 4x4
  2. the input is a point, therefore a 4-vector in homogeneous coordinates
  3. the output is also a point, therefore also a 4-vector

$$
\begin{bmatrix} a_1 & a_2 & a_3 & a_4 \\ b_1 & b_2 & b_3 & b_4 \\ c_1 & c_2 & c_3 & c_4 \\ d_1 & d_2 & d_3 & d_4 \end{bmatrix} \begin{bmatrix} x\\y\\z\\1 \end{bmatrix} = \begin{bmatrix}x'\\y'\\z'\\1\end{bmatrix}
$$

Our goal is to find the relationships between the input and output to determine the perspective matrix.

🔍Observation 1️⃣: Proportional

We start with a key observation:

Lemma 1

The perspective matrix leaves points on the near plane ($z = n$) unchanged, but squishes every other point so that it shares the same $x$ and $y$ as its “complement” on the near plane ($z = n$), i.e. the point where the line from the origin through it crosses the near plane.

After squishing:

  • 🟒 shares the same xx and yy as 🟑
  • πŸ”΅ shares the same xx and yy as πŸ”΄

Now we can establish the relationship. For every point after squishing, there is a 🟤point on the near plane with the same $y'$. At the same time, this 🟤point lies on the line from the origin through the point before squishing.

Lemma 2

==The values $y$ and $y'$ are proportional to their $z$ values==! (similar triangles)

See the diagram below.

Hence, for arbitrary $x$ and $y$, we can derive:

$$
\begin{align} \frac{y'}{n} &= \frac{y}{z} \implies y' = \frac{ny}{z} \\ \frac{x'}{n} &= \frac{x}{z} \implies x' = \frac{nx}{z} \end{align}
$$
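As a quick sanity check with made-up numbers (chosen purely for illustration), take $n = 2$ and the point $(x, y, z) = (3, 6, 4)$:

$$
x' = \frac{nx}{z} = \frac{2 \cdot 3}{4} = 1.5, \qquad y' = \frac{ny}{z} = \frac{2 \cdot 6}{4} = 3.
$$

The point is twice as far away as the near plane, so both coordinates are halved.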

Thus, the matrix multiplication can be expressed as:

$$
\begin{align} \begin{bmatrix} a_1 & a_2 & a_3 & a_4 \\ b_1 & b_2 & b_3 & b_4 \\ c_1 & c_2 & c_3 & c_4 \\ d_1 & d_2 & d_3 & d_4 \end{bmatrix} \begin{bmatrix} x\\y\\z\\1 \end{bmatrix} &= \begin{bmatrix}x'\\y'\\z'\\1\end{bmatrix} \\ &=\begin{bmatrix}\frac{nx}{z}\\\frac{ny}{z}\\?\\1\end{bmatrix} \end{align}
$$

Remark

Note that

  1. the 3rd value $z'$ is still unknown.
  2. the 4th value is $1$ as the output is still a point.

According to the properties in homogeneous coordinates, we have:

Lemma 3

In homogeneous coordinates, multiplying a point by a nonzero scalar $k$ gives the same point. That is, $\mathbf{p} = (x, y, z, w)^T$ and $k\mathbf{p} = (kx, ky, kz, kw)^T$ are considered equivalent.

We can then multiply the output vector $\displaystyle \begin{bmatrix}\frac{nx}{z}\\\frac{ny}{z}\\?\\1\end{bmatrix}$ by $z$ (which is nonzero for every point in the frustum), and it is still the same point:

$$
\begin{bmatrix}\frac{nx}{z}\\\frac{ny}{z}\\?\\1\end{bmatrix} = \begin{bmatrix}{nx}\\ny\\?\\z\end{bmatrix} \qquad (\text{the same point})
$$
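Lemma 3 can be sketched in code as follows. The helper `to_cartesian` is a hypothetical name introduced here: it divides a homogeneous 4-vector by its 4th component. `Fraction` is used only to keep the arithmetic exact, and the sample numbers are arbitrary.

```python
from fractions import Fraction


def to_cartesian(vec):
    """Divide a homogeneous 4-vector by its 4th component (must be nonzero)."""
    x, y, z, w = vec
    if w == 0:
        raise ValueError("the 4th component is 0, so this is not a point")
    return (Fraction(x) / w, Fraction(y) / w, Fraction(z) / w)


p = (Fraction(3, 2), 3, 5, 1)       # a point with 4th component 1
scaled = tuple(4 * c for c in p)    # the same vector multiplied by k = 4

print(scaled == (6, 12, 20, 4))                 # True
print(to_cartesian(p) == to_cartesian(scaled))  # True
```

Both vectors reduce to the same 3D point, which is why the output can be scaled by $z$ without changing what it represents.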

Now, the matrix multiplication becomes:

$$
\begin{align} \begin{bmatrix} a_1 & a_2 & a_3 & a_4 \\ b_1 & b_2 & b_3 & b_4 \\ c_1 & c_2 & c_3 & c_4 \\ d_1 & d_2 & d_3 & d_4 \end{bmatrix} \begin{bmatrix} x\\y\\z\\1 \end{bmatrix} &=\begin{bmatrix}\frac{nx}{z}\\\frac{ny}{z}\\?\\1\end{bmatrix} \\ &=\begin{bmatrix}{nx}\\ny\\?\\z\end{bmatrix} \\ x \begin{bmatrix} a_1 \\ b_1 \\ c_1 \\ d_1 \end{bmatrix} + y \begin{bmatrix} a_2 \\ b_2 \\ c_2 \\ d_2 \end{bmatrix} + z \begin{bmatrix} a_3 \\ b_3 \\ c_3 \\ d_3 \end{bmatrix} + 1 \begin{bmatrix} a_4 \\ b_4 \\ c_4 \\ d_4 \end{bmatrix} &=\begin{bmatrix}{nx}\\ny\\?\\z\end{bmatrix} \end{align}
$$

By examining each row:

  • ✅1st row: $a_1 x + a_2 y + a_3 z + a_4 = nx$. This must hold for every $x$, $y$, and $z$, so the only possibility is $a_1 = n$ with all the others $0$.
  • ✅2nd row: $b_1 x + b_2 y + b_3 z + b_4 = ny$. By the same argument, the only possibility is $b_2 = n$ with all the others $0$.
  • ❌3rd row: we don’t have enough information to decode this row yet.
  • ✅4th row: $d_1 x + d_2 y + d_3 z + d_4 = z$. Again, this forces $d_3 = 1$ with all the others $0$.
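To see why the 1st row leaves no other choice, one way (a sketch, not the only possible argument) is to treat $a_1 x + a_2 y + a_3 z + a_4 = nx$ as an identity in $x$, $y$, $z$ and plug in a few specific values:

$$
\begin{align}
(x, y, z) = (0, 0, 0) &\implies a_4 = 0 \\
(x, y, z) = (1, 0, 0) &\implies a_1 + a_4 = n \implies a_1 = n \\
(x, y, z) = (0, 1, 0) &\implies a_2 + a_4 = 0 \implies a_2 = 0 \\
(x, y, z) = (0, 0, 1) &\implies a_3 + a_4 = 0 \implies a_3 = 0
\end{align}
$$

The 2nd and 4th rows follow the same pattern.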

Wonderful! So far, the matrix looks like this:

$$
\begin{bmatrix} n & 0 & 0 & 0\\ 0 & n & 0 & 0\\ c_1 & c_2 & c_3 & c_4\\ 0 & 0 & 1 & 0\\ \end{bmatrix} \begin{bmatrix} x\\y\\z\\1 \end{bmatrix} = \begin{bmatrix}{nx}\\ny\\?\\z\end{bmatrix}
$$

🔍Observation 2️⃣: Points Don’t Move

After the first observation, only the 3rd row remains unknown. This row determines the $z$ value of a point. Building on Lemma 1, we notice that 2 kinds of special points don’t move under the transformation:

  1. 🟑points on the near plane: after the squishing, the points on the near plane stick to the original positions.
  2. πŸ”΄the center point of the far plane: other points in this plane got squished towards center while only this point snaps to the original position.

We can express this observation algebraically.

For any 🟡point on the near plane, which doesn’t change, we have

  • $\displaystyle \begin{bmatrix}n & 0 & 0 & 0\\0 & n & 0 & 0\\c_1&c_2&c_3&c_4\\0 & 0 & 1 & 0\\\end{bmatrix}\begin{bmatrix}x\\y\\\textcolor{orange}{n}\\1\end{bmatrix}=\begin{bmatrix}x\\y\\\textcolor{orange}{n}\\1\end{bmatrix}$
  • Writing the same point in homogeneous coordinates scaled by $n$, we then have
  • $\displaystyle \begin{bmatrix}n & 0 & 0 & 0\\0 & n & 0 & 0\\c_1&c_2&c_3&c_4\\0 & 0 & 1 & 0\\\end{bmatrix}\begin{bmatrix}x\\y\\\textcolor{orange}{n}\\1\end{bmatrix}=\begin{bmatrix}nx\\ny\\\textcolor{orange}{n^2}\\n\end{bmatrix}$
  • Looking at the 3rd row, we get
  • $\displaystyle c_1x + c_2y + c_3n + c_4= n^2$

For the 🔴center point of the far plane, which doesn’t change, we have

  • $\displaystyle \begin{bmatrix}n & 0 & 0 & 0\\0 & n & 0 & 0\\c_1 & c_2 & c_3 & c_4\\0 & 0 & 1 & 0\\\end{bmatrix}\begin{bmatrix}0\\0\\\textcolor{orange}{f}\\1\end{bmatrix}=\begin{bmatrix}0\\0\\\textcolor{orange}{f}\\1\end{bmatrix}$
  • Writing the same point in homogeneous coordinates scaled by $f$, we then have
  • $\displaystyle \begin{bmatrix}n & 0 & 0 & 0\\0 & n & 0 & 0\\ c_1 & c_2 & c_3 & c_4\\0 & 0 & 1 & 0\\\end{bmatrix}\begin{bmatrix} 0\\0\\\textcolor{orange}{f}\\1\end{bmatrix}=\begin{bmatrix} 0\\0\\\textcolor{orange}{f^2}\\f\end{bmatrix}$
  • Looking at the 3rd row, we get
  • $\displaystyle c_1x + c_2y + c_3f + c_4= f^2$, where $x = y = 0$ for this point

Now let’s put them together:

$$
\begin{align} c_1x + c_2y + c_3n + c_4 &= n^2 \\ c_1x + c_2y + c_3f + c_4 &= f^2 \\ \end{align}
$$

The first equation must hold for every $x$ and $y$ on the near plane, while its right-hand side $n^2$ does not depend on them, so $c_1$ and $c_2$ must both be $0$. This simplifies the equations to:

$$
\begin{align} c_3n + c_4 &= n^2 \\ c_3f + c_4 &= f^2 \\ \end{align}
$$

Boom🤯! This is a system of 2 equations in 2 unknowns. We can easily solve it and get

$$
\begin{align} c_3 &= n+f \\ c_4 &= -fn. \\ \end{align}
$$
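The same solution can be checked numerically. The sketch below solves the 2×2 system by elimination with exact fractions and assumes $n \neq f$. The function name `solve_third_row` is made up for this example.

```python
from fractions import Fraction


def solve_third_row(n, f):
    """Solve c3*n + c4 = n^2 and c3*f + c4 = f^2 for (c3, c4); needs n != f."""
    n, f = Fraction(n), Fraction(f)
    # Subtracting the first equation from the second eliminates c4:
    #   c3 * (f - n) = f^2 - n^2
    c3 = (f**2 - n**2) / (f - n)
    # Substitute c3 back into the first equation to get c4.
    c4 = n**2 - c3 * n
    return c3, c4


c3, c4 = solve_third_row(n=1, f=10)
print(c3, c4)  # 11 -10
```

With $n = 1$ and $f = 10$ this gives $c_3 = 11 = n + f$ and $c_4 = -10 = -fn$, matching the closed form.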

Finally, we have solved for the perspective matrix:

$$
P = \begin{bmatrix} n & 0 & 0 & 0\\ 0 & n & 0 & 0\\ 0 & 0 & n+f & -fn\\ 0 & 0 & 1 & 0\\ \end{bmatrix}.
$$
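As a final check, the sketch below applies $P$ to a few points and confirms both observations. Near-plane points and the far-plane center stay put, while a point elsewhere gets $x' = nx/z$ and $y' = ny/z$. The helper name `transform` and the values $n = 1$, $f = 3$ are arbitrary choices for illustration.

```python
from fractions import Fraction


def transform(n, f, point):
    """Multiply P by (x, y, z, 1) and divide by the 4th component."""
    x, y, z = (Fraction(c) for c in point)
    # The four rows of P applied to (x, y, z, 1).
    hx, hy, hz, hw = n * x, n * y, (n + f) * z - f * n, z
    return (hx / hw, hy / hw, hz / hw)


n, f = 1, 3

# Observation 2: points on the near plane (z = n) do not move.
for p in [(0, 0, n), (2, -1, n), (-5, 7, n)]:
    assert transform(n, f, p) == p

# Observation 2: the center of the far plane (0, 0, f) does not move.
assert transform(n, f, (0, 0, f)) == (0, 0, f)

# Observation 1: elsewhere, x and y are scaled by n / z.
x, y, z = 4, 6, 2
xp, yp, zp = transform(n, f, (x, y, z))
assert (xp, yp) == (Fraction(n * x, z), Fraction(n * y, z))
print(xp, yp, zp)  # 2 3 5/2
```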

👀See Also

For more on transformations, check out these videos: