In DirectX, in most cases a matrix is a 4×4 row-major matrix with float components that can be visualized as a table:
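As a sketch of that table, with m_{ij} as an assumed name for the entry in row i and column j (this notation is not taken from the original figure), such a matrix can be written as:

```latex
M =
\begin{pmatrix}
m_{11} & m_{12} & m_{13} & m_{14} \\
m_{21} & m_{22} & m_{23} & m_{24} \\
m_{31} & m_{32} & m_{33} & m_{34} \\
m_{41} & m_{42} & m_{43} & m_{44}
\end{pmatrix}
```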
Note that matrices are column-major in OpenGL. While the general content of this article also applies to OpenGL, both vectors and matrices are transposed, which results in a reversed multiplication order. Keep that in mind when you work with OpenGL.
Matrices can be used to transform vectors. More details on this are given in the next chapter. In the above matrix three areas are marked:
- The blue area (the upper-left 3×3 block) describes linear transformations. Linear transformations preserve the origin. Examples are scalings, rotations and shearings.
- The green area (the first three entries of the fourth row) describes affine transformations, i.e. translations.
- The red area (the fourth column) describes perspective transformations. This area does not play a big role for world and view matrices.
To be exact, it is worth mentioning that every linear transformation is an affine transformation. So, the green area describes those affine transformations that are not linear.
In DirectX, vectors are row vectors. They are transformed by multiplying them with a matrix, with the vector as the left operand. Here is the definition of this multiplication. For reasons of clarity I wrote the result as a transposed (superscript T) column vector. Transposition means that rows and columns are swapped, so the result of the multiplication is again a row vector:
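A sketch of this definition, assuming the input is a position (x, y, z) extended by a fourth component of 1 and that m_{ij} denotes the entry of the matrix M in row i and column j (both are notational assumptions):

```latex
\begin{pmatrix} x & y & z & 1 \end{pmatrix} \cdot M =
\begin{pmatrix}
x\,m_{11} + y\,m_{21} + z\,m_{31} + m_{41} \\
x\,m_{12} + y\,m_{22} + z\,m_{32} + m_{42} \\
x\,m_{13} + y\,m_{23} + z\,m_{33} + m_{43} \\
x\,m_{14} + y\,m_{24} + z\,m_{34} + m_{44}
\end{pmatrix}^{T}
```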
Let’s have a closer look at the result. The first thing that attracts our attention is the fact that the vector has got a new component. This last component is the w-component and is used for perspective transformations. Vectors describing a position have a 1 in this component; vectors describing directions or normals have a 0. After the transformation, a position vector is multiplied by the reciprocal of its w-component, which makes w equal to 1 again, so that it can be dropped. This process is called the perspective divide (or homogeneous divide).
Additionally, we can see that the result is a sum of products. The first component is the sum of the componentwise products of the matrix's first column and the input vector. Because the w-component is 1 for positions, the translation (fourth row) is simply added to the vector. This is reasonable because a translation can be expressed as the addition of the translation vector. Directions should not be affected by translations. And indeed, due to their zero w-component, the translation parts of the matrix have no effect on direction vectors or normals.
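The effect of the w-component can be checked with a few lines of code. The following is a minimal sketch in plain Python, not DirectX code; the helper names `transform` and `homogeneous_divide` are made up for this illustration, and matrices are stored as lists of rows.

```python
def transform(v, m):
    """Multiply row vector v (4 components) by 4x4 matrix m: v * m."""
    return [sum(v[k] * m[k][j] for k in range(4)) for j in range(4)]

def homogeneous_divide(v):
    """Multiply by 1/w so that w becomes 1, then drop the w-component."""
    x, y, z, w = v
    return [x / w, y / w, z / w]

# Translation by (5, 0, 0): the offset sits in the fourth row.
translation = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [5.0, 0.0, 0.0, 1.0],
]

point = [1.0, 2.0, 3.0, 1.0]      # position: w = 1
direction = [1.0, 2.0, 3.0, 0.0]  # direction: w = 0

moved_point = transform(point, translation)          # [6.0, 2.0, 3.0, 1.0]
moved_direction = transform(direction, translation)  # [1.0, 2.0, 3.0, 0.0]
position_3d = homogeneous_divide(moved_point)        # [6.0, 2.0, 3.0]
```

The point is shifted by the translation, while the direction passes through unchanged because its w-component multiplies the fourth row by zero.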
Transformations that are described by matrices can be combined by multiplying the matrices. The order of the factors matters, because matrix multiplication is not commutative. In the following examples, T(v) represents a translation by vector v and RZ(x) represents a rotation about the z-axis by angle x.
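That the order matters can be seen in a small experiment. This is a sketch in plain Python under the row-vector convention used in this article; `T`, `RZ`, `mat_mul` and `transform` are illustrative helpers written for this example, not DirectX functions, and the sign layout of `RZ` is the usual one for row vectors (a positive angle turns the x-axis towards the y-axis).

```python
import math

def mat_mul(a, b):
    """4x4 matrix product a * b (matrices are lists of rows)."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transform(v, m):
    """Row vector times matrix: v * m."""
    return [sum(v[k] * m[k][j] for k in range(4)) for j in range(4)]

def T(x, y, z):
    """Translation matrix; the offset is stored in the fourth row."""
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [x, y, z, 1.0]]

def RZ(degrees):
    """Rotation about the z-axis for row vectors."""
    a = math.radians(degrees)
    c, s = math.cos(a), math.sin(a)
    return [[c, s, 0.0, 0.0],
            [-s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

p = [1.0, 0.0, 0.0, 1.0]

# Translate first, then rotate about the global origin.
translate_then_rotate = transform(p, mat_mul(T(1, 0, 0), RZ(90)))
# Rotate first, then translate.
rotate_then_translate = transform(p, mat_mul(RZ(90), T(1, 0, 0)))
```

Up to floating-point rounding, the first product moves the point to (0, 2, 0) and the second one to (1, 1, 0), so swapping the factors changes the result.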
Consider the following transformation M:
This transform can be interpreted in two ways:
- From left to right:
Combining transforms from left to right leads to transforms being executed with respect to the global coordinate system.
- From right to left:
If we combine the matrices in the opposite direction, the coordinate system is transformed, too. That’s why each transformation is executed in its own local coordinate system, as left behind by the transformations read before it.
Of course, the result is the same. The following figure explains the difference in detail:
Interpretation left to right
Every transformation is executed in the global coordinate system. The rotation is performed about the global origin.
Interpretation right to left
- After the first translation the local origin is at (0,-2,0)
- After the rotation the local coordinate system is rotated (x axis points upwards, y axis points right)
Grouping objects (Scene Graphs)
Transforms can be used to group separate objects. Consider the following group:
- Transform: T(-2,0,0)
- Transform: RZ(45°)
- Transform: RZ(45°)
So, with no other transforms applied, the group looks like this:
The interpretation from right to left (including the transformation of local coordinate systems) will help us most. Climbing down the tree, whenever we reach a transform, it has to be multiplied onto the left side of the transform matrix accumulated so far, so that the accumulated matrix also affects the new transform's coordinate system.
So, the overall matrix for the triangle would be the identity matrix (there are no transforms applied for this one). The transform for the square will be: RZ(45°) * T(-2,0,0).
If we want to transform the whole group, the new matrix has to be multiplied onto the right side of the object-specific matrices, because it is supposed to transform the local coordinate systems, too. Of course, the additional matrix can be created by either the left-to-right or the right-to-left interpretation.
Suppose the whole group is to be translated by 2 units downwards and afterwards rotated by 45° clockwise. Read from right to left, i.e. with the rotation taking place in the already translated local coordinate system, the group transform is RZ(-45°) * T(0,-2,0). Because there are no additional transforms stored for the triangle, this is the overall transform for this geometry, too. The overall transform for the square will be RZ(45°) * T(-2,0,0) * RZ(-45°) * T(0,-2,0). This will result in the following image:
Summing up, in order to calculate the overall transform of an object, the separate transforms have to be multiplied from left to right, beginning with the most specific one for the object and ending with the most general one.
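The rule can be sketched as a small tree traversal. The code below is plain Python, not DirectX; the node layout (tuples of name, local matrix and children) and all helper names are assumptions made for this illustration, and the tree is only one possible arrangement that reproduces the square's transform given above.

```python
import math

def mat_mul(a, b):
    """4x4 matrix product a * b (matrices are lists of rows)."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transform(v, m):
    """Row vector times matrix: v * m."""
    return [sum(v[k] * m[k][j] for k in range(4)) for j in range(4)]

def identity():
    return [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

def T(x, y, z):
    m = identity()
    m[3][:3] = [x, y, z]  # translation lives in the fourth row
    return m

def RZ(degrees):
    a = math.radians(degrees)
    c, s = math.cos(a), math.sin(a)
    m = identity()
    m[0][0], m[0][1] = c, s
    m[1][0], m[1][1] = -s, c
    return m

def collect_world_transforms(node, parent_world, out):
    """node is a hypothetical (name, local_matrix, children) tuple."""
    name, local, children = node
    # The more specific transform goes on the left of what exists already.
    world = mat_mul(local, parent_world)
    out[name] = world
    for child in children:
        collect_world_transforms(child, world, out)
    return out

square_branch = ("translate", T(-2, 0, 0),
                 [("rotate", RZ(45), [("square", identity(), [])])])
group = ("group", mat_mul(RZ(-45), T(0, -2, 0)),
         [("triangle", identity(), []), square_branch])

world = collect_world_transforms(group, identity(), {})
origin = [0.0, 0.0, 0.0, 1.0]
square_origin = transform(origin, world["square"])
triangle_origin = transform(origin, world["triangle"])
```

With these inputs the square's matrix equals RZ(45°) * T(-2,0,0) * RZ(-45°) * T(0,-2,0); its local origin ends up at roughly (-1.41, -0.59, 0), and the triangle's origin at (0, -2, 0).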
Let’s end with an illustration of the square’s transformation for both interpretations:
Interpretation left to right:
Interpretation right to left: