In computer graphics, the perspective matrix is a key tool that transforms the view frustum into a rectangular cuboid, like so:
(Diagram redrawn after © Fundamentals of Computer Graphics)
Note
This post focuses on the perspective matrix. For a broader understanding of viewing, camera, projection, and viewport transformations, check out Viewing Transformation: from Camera to Screen.
Definition
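The perspective matrix is defined as:

$$
\begin{bmatrix}
n & 0 & 0 & 0\\
0 & n & 0 & 0\\
0 & 0 & n+f & -nf\\
0 & 0 & 1 & 0
\end{bmatrix}
$$

where
- $n$ is the distance to the near plane, and
- $f$ is the distance to the far plane.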
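To see the matrix in action, here is a minimal Python sketch. It assumes the matrix derived in this post, with rows $(n, 0, 0, 0)$, $(0, n, 0, 0)$, $(0, 0, n+f, -nf)$ and $(0, 0, 1, 0)$. The helper names `perspective_matrix` and `apply`, and the sample values $n = 2$, $f = 10$, are illustrative choices, not part of any library.

```python
def perspective_matrix(n, f):
    """Return the 4x4 frustum-to-cuboid matrix as nested lists."""
    return [
        [n, 0, 0, 0],
        [0, n, 0, 0],
        [0, 0, n + f, -n * f],
        [0, 0, 1, 0],
    ]


def apply(matrix, point):
    """Multiply a 4x4 matrix by a homogeneous 4-vector, then divide by w."""
    raw = [sum(m * p for m, p in zip(row, point)) for row in matrix]
    w = raw[3]
    return [c / w for c in raw]


# Illustrative values: near plane at z = 2, far plane at z = 10.
P = perspective_matrix(2.0, 10.0)

# A point halfway into the frustum, written as (x, y, z, 1).
result = apply(P, [3.0, 4.0, 5.0, 1.0])
print(result)  # x and y are scaled by n / z = 2 / 5
```

The division by `w` at the end of `apply` is what turns the homogeneous result back into an ordinary point.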
👣 Step by Step Explanation
Let’s start from scratch, assuming we don’t know the matrix yet. All we know is:
- the matrix is 4×4
- the input is a point, and therefore a 4-vector in homogeneous coordinates
- the output is also a point, and therefore also a 4-vector
Our goal is to find the relationships between the input and output to determine the perspective matrix.
Observation 1️⃣: Proportional
We start with a key observation:
Lemma 1
The perspective matrix leaves points on the near plane ($z = n$) unchanged but squishes other points so that they share the same $x$ and $y$ as their “complement” in the near plane ($z = n$).
^d2acd0
After squishing:
- 🟢 shares the same $x$ and $y$ as 🟡
- 🔵 shares the same $x$ and $y$ as 🔴
We can now start to establish the relationship! Look at a point after squishing: there is always a point in the near plane with the same $x$ and $y$ values. Meanwhile, this near-plane point is related to the point before squishing.
Lemma 2
==The $x$ and $y$ values after squishing are proportional to those before, with ratio $n / z$==! (similar triangles)
See the diagram below.
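Hence, for arbitrary $x$ and $y$, writing $(x', y')$ for the values after squishing, we can derive:

$$
x' = \frac{n}{z}x, \qquad y' = \frac{n}{z}y
$$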
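As a quick numeric check of the similar-triangles relation, the sketch below scales $x$ and $y$ by $n / z$. The function name `squish_xy` and the sample numbers are made up for illustration.

```python
def squish_xy(x, y, z, n):
    """Scale x and y by the similar-triangles ratio n / z."""
    return x * n / z, y * n / z


n = 2.0  # illustrative near-plane position

# A point deeper in the frustum (z = 5) is pulled towards the z-axis.
x_new, y_new = squish_xy(3.0, 4.0, 5.0, n)
print(x_new, y_new)

# A point already on the near plane (z = n) keeps its x and y.
on_near = squish_xy(3.0, 4.0, n, n)
print(on_near)  # (3.0, 4.0)
```

The deeper the point (larger $z$), the smaller the ratio $n / z$ and the stronger the squish.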
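Thus, the matrix multiplication can be expressed as:

$$
\begin{bmatrix}
a_1 & a_2 & a_3 & a_4\\
b_1 & b_2 & b_3 & b_4\\
c_1 & c_2 & c_3 & c_4\\
d_1 & d_2 & d_3 & d_4
\end{bmatrix}
\begin{bmatrix}x\\y\\z\\1\end{bmatrix}
=
\begin{bmatrix}\frac{n}{z}x\\\frac{n}{z}y\\?\\1\end{bmatrix}
$$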
Remark
Note that
- the 3rd value is still unknown.
- the 4th value is $1$, as the output is still a point.
According to the properties in homogeneous coordinates, we have:
Lemma 3
In homogeneous coordinates, multiplying a point by a nonzero scalar $k$ still gives the same point. That is, $(x, y, z, 1)$ and $(kx, ky, kz, k)$ are considered equivalent.
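We can then multiply the output vector by $z$:

$$
\begin{bmatrix}\frac{n}{z}x\\\frac{n}{z}y\\?\\1\end{bmatrix}
\sim
\begin{bmatrix}nx\\ny\\?\\z\end{bmatrix}
$$

and it is still the same point!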
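The equivalence of scaled homogeneous vectors can be checked with a few lines of Python. This is only a sketch: `dehomogenize` and `same_point` are hypothetical helper names, and the sample point and scale factor are arbitrary.

```python
def dehomogenize(v):
    """Divide a homogeneous 4-vector by its last component w."""
    x, y, z, w = v
    if w == 0:
        raise ValueError("w must be nonzero to represent a point")
    return (x / w, y / w, z / w)


def same_point(u, v, tol=1e-9):
    """Two homogeneous vectors are equivalent if they dehomogenize equally."""
    a, b = dehomogenize(u), dehomogenize(v)
    return all(abs(p - q) <= tol for p, q in zip(a, b))


point = (1.2, 1.6, 8.0, 1.0)
k = 5.0  # any nonzero scalar works
scaled = tuple(k * c for c in point)

print(same_point(point, scaled))  # True
```

Multiplying by $z$ in the derivation is exactly this trick with $k = z$, which is nonzero for every point inside the frustum.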
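Now, the matrix multiplication becomes:

$$
\begin{bmatrix}
a_1 & a_2 & a_3 & a_4\\
b_1 & b_2 & b_3 & b_4\\
c_1 & c_2 & c_3 & c_4\\
d_1 & d_2 & d_3 & d_4
\end{bmatrix}
\begin{bmatrix}x\\y\\z\\1\end{bmatrix}
=
\begin{bmatrix}nx\\ny\\?\\z\end{bmatrix}
$$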
By examining each row:
- ✅ 1st row: $a_1 x + a_2 y + a_3 z + a_4 = nx$. Since this must hold for every point, the only possibility is $a_1 = n$, with the others all $0$.
- ✅ 2nd row: $b_1 x + b_2 y + b_3 z + b_4 = ny$. Likewise, the only possibility is $b_2 = n$, with the others all $0$.
- ❓ 3rd row: we don’t have enough information to decode this row yet.
- ✅ 4th row: $d_1 x + d_2 y + d_3 z + d_4 = z$. Here $d_3 = 1$, with the others all $0$.
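Wonderful! So far, the matrix looks like this:

$$
\begin{bmatrix}
n & 0 & 0 & 0\\
0 & n & 0 & 0\\
c_1 & c_2 & c_3 & c_4\\
0 & 0 & 1 & 0
\end{bmatrix}
$$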
Observation 2️⃣: Points Don’t Move
From the last observation, only the 3rd row is left unknown. Evidently, this row determines the $z$ value of a point. Recalling Lemma 1, we notice that 2 kinds of special points don’t move after the transformation:
- 🟡 points on the near plane: after the squishing, the points on the near plane stay at their original positions.
- 🔴 the center point of the far plane: other points in this plane get squished towards the center, while only this point stays at its original position.
We can express this observation algebraically.
For any 🟡 point on the near plane, which doesn’t move, we have
- $$\displaystyle \begin{align}\begin{bmatrix}n & 0 & 0 & 0\\0 & n & 0 & 0\\c_1&c_2&c_3&c_4\\0 & 0 & 1 & 0\\\end{bmatrix}\begin{bmatrix}x\\y\\\textcolor{orange}{n}\\1\end{bmatrix}&=\begin{bmatrix}x\\y\\\textcolor{orange}{n}\\1\end{bmatrix}\end{align}$$
- Writing the output point in homogeneous coordinates (multiplying by $n$), we then have
- $$\displaystyle \begin{align}\begin{bmatrix}n & 0 & 0 & 0\\0 & n & 0 & 0\\c_1&c_2&c_3&c_4\\0 & 0 & 1 & 0\\\end{bmatrix}\begin{bmatrix}x\\y\\\textcolor{orange}{n}\\1\end{bmatrix}&=\begin{bmatrix}nx\\ny\\\textcolor{orange}{n^2}\\n\end{bmatrix}\end{align}$$
- Looking at the 3rd row, we get $c_1 x + c_2 y + c_3 n + c_4 = n^2$.
For the 🔴 center point of the far plane, which doesn’t move either, we have
- $$\displaystyle \begin{align}\begin{bmatrix}n & 0 & 0 & 0\\0 & n & 0 & 0\\c_1 & c_2 & c_3 & c_4\\0 & 0 & 1 & 0\\\end{bmatrix}\begin{bmatrix}0\\0\\\textcolor{orange}{f}\\1\end{bmatrix}&=\begin{bmatrix}0\\0\\\textcolor{orange}{f}\\1\end{bmatrix}\end{align}$$
- Writing the output point in homogeneous coordinates (multiplying by $f$), we then have
- $$\displaystyle \begin{align}\begin{bmatrix}n & 0 & 0 & 0\\0 & n & 0 & 0\\ c_1 & c_2 & c_3 & c_4\\0 & 0 & 1 & 0\\\end{bmatrix}\begin{bmatrix} 0\\0\\\textcolor{orange}{f}\\1\end{bmatrix}&=\begin{bmatrix} 0\\0\\\textcolor{orange}{f^2}\\f\end{bmatrix}\end{align}$$
- Looking at the 3rd row, we get $c_3 f + c_4 = f^2$.
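Now let’s put them together:

$$
\begin{cases}
c_1 x + c_2 y + c_3 n + c_4 = n^2\\
c_3 f + c_4 = f^2
\end{cases}
$$

Since the right-hand sides are $n^2$ and $f^2$, $c_1$ and $c_2$ must both be $0$: the first equation has to hold for every $x$ and $y$ on the near plane, so nothing involving $x$ or $y$ can remain. This simplifies the equations to:

$$
\begin{cases}
c_3 n + c_4 = n^2\\
c_3 f + c_4 = f^2
\end{cases}
$$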
Boom 🤯! This is a system of 2 linear equations in 2 unknowns, $c_3$ and $c_4$. We can easily solve it and get:

$$
c_3 = n + f, \qquad c_4 = -nf
$$
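For readers who want the intermediate algebra, here is one way to do the elimination. It assumes the two simplified equations are $c_3 n + c_4 = n^2$ and $c_3 f + c_4 = f^2$, and that $n \neq f$. Subtracting the second from the first removes $c_4$:

$$
\begin{aligned}
(c_3 n + c_4) - (c_3 f + c_4) &= n^2 - f^2\\
c_3 (n - f) &= (n - f)(n + f)\\
c_3 &= n + f\\
c_4 &= n^2 - c_3 n = n^2 - (n + f)\,n = -nf
\end{aligned}
$$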
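Finally, we have the complete perspective matrix:

$$
\begin{bmatrix}
n & 0 & 0 & 0\\
0 & n & 0 & 0\\
0 & 0 & n+f & -nf\\
0 & 0 & 1 & 0
\end{bmatrix}
$$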
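As a sanity check, the sketch below applies the four rows $(n, 0, 0, 0)$, $(0, n, 0, 0)$, $(0, 0, n+f, -nf)$ and $(0, 0, 1, 0)$ to the two kinds of special points from Observation 2 and confirms they stay put. The function name `squish` and the values $n = 2$, $f = 10$ are illustrative assumptions.

```python
def squish(point, n, f):
    """Apply the derived matrix row by row, then divide by the new w."""
    x, y, z, w = point
    hx = n * x                     # 1st row: (n, 0, 0, 0)
    hy = n * y                     # 2nd row: (0, n, 0, 0)
    hz = (n + f) * z - n * f * w   # 3rd row: (0, 0, n + f, -n f)
    hw = z                         # 4th row: (0, 0, 1, 0)
    return (hx / hw, hy / hw, hz / hw, 1.0)


n, f = 2.0, 10.0  # illustrative near and far plane positions

near_point = squish((3.0, 4.0, n, 1.0), n, f)
far_center = squish((0.0, 0.0, f, 1.0), n, f)
far_corner = squish((5.0, 5.0, f, 1.0), n, f)

print(near_point)  # (3.0, 4.0, 2.0, 1.0): near-plane point unchanged
print(far_center)  # (0.0, 0.0, 10.0, 1.0): far-plane center unchanged
print(far_corner)  # (1.0, 1.0, 10.0, 1.0): pulled towards the center, z stays f
```

The last line shows the squish itself: an off-center point on the far plane keeps $z = f$ while its $x$ and $y$ shrink by $n / f$.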
See Also
For more on transformations, check out these videos: