DirectX and Matrices

Fundamentals

In DirectX, in most cases a matrix is a 4×4 row-major matrix with float components that can be visualized as a table:
A 4x4 matrix
Note that OpenGL conventionally uses column-major matrices and column vectors. The general content of this article applies to OpenGL as well, but both vectors and matrices are transposed there, which reverses the multiplication order. Keep that in mind when you work with OpenGL.
Matrices can be used to transform vectors. More details on this are given in the next chapter. In the above matrix three areas are marked:

  • The blue area
    (the upper-left 3×3 block) describes linear transformations. Linear transformations preserve the origin. Examples are scalings, rotations and shearings.
  • The green area
    (the first three entries of the fourth row) describes affine transformations, i.e. translations.
  • The red area
    (the fourth column) describes perspective transformations. This area does not play a big role for world and view matrices.

To be exact, every linear transformation is also an affine transformation. So the green area describes those affine transformations that are not linear.
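Since the figure is not reproduced here, the layout can be sketched as follows. The entry names m_ij (row i, column j) are my own labels, and the block boundaries assume the usual layout for the row-vector convention:

```latex
M =
\begin{pmatrix}
m_{11} & m_{12} & m_{13} & m_{14} \\
m_{21} & m_{22} & m_{23} & m_{24} \\
m_{31} & m_{32} & m_{33} & m_{34} \\
m_{41} & m_{42} & m_{43} & m_{44}
\end{pmatrix}
\qquad
\begin{aligned}
\text{linear part (blue):} \quad & m_{11} \dots m_{33} \\
\text{translation (green):} \quad & m_{41},\ m_{42},\ m_{43} \\
\text{perspective part (red):} \quad & m_{14},\ m_{24},\ m_{34},\ m_{44}
\end{aligned}
```

For a pure translation, scaling or rotation the fourth column is simply (0, 0, 0, 1) transposed.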
In DirectX, vectors are row vectors. They are transformed by multiplying them onto a matrix from the left (v · M). Here is the definition of this multiplication. For clarity, the result is written as a transposed (superscript T) column vector. Transposition swaps rows and columns, so the result of the multiplication is again a row vector:
Vector Matrix Multiplication
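The original formula image is missing, so here is a reconstruction from the description in the text. It assumes the matrix entries are named m_ij (row i, column j) and that the input vector has already been extended by its w-component:

```latex
\begin{pmatrix} x & y & z & w \end{pmatrix}
\cdot
\begin{pmatrix}
m_{11} & m_{12} & m_{13} & m_{14} \\
m_{21} & m_{22} & m_{23} & m_{24} \\
m_{31} & m_{32} & m_{33} & m_{34} \\
m_{41} & m_{42} & m_{43} & m_{44}
\end{pmatrix}
=
\begin{pmatrix}
x\,m_{11} + y\,m_{21} + z\,m_{31} + w\,m_{41} \\
x\,m_{12} + y\,m_{22} + z\,m_{32} + w\,m_{42} \\
x\,m_{13} + y\,m_{23} + z\,m_{33} + w\,m_{43} \\
x\,m_{14} + y\,m_{24} + z\,m_{34} + w\,m_{44}
\end{pmatrix}^{T}
```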
Let’s have a closer look at the result. The first thing that attracts our attention is the fact that the vector has got a new component. This last component is the w-component and is used for perspective transformations. Vectors describing a position have a 1 in this component; vectors describing directions or normals have a 0. After transformation, a position vector is multiplied by the inverse of its w-component, i.e. all components are divided by w. Afterwards the w-component can be removed. This process is called the perspective divide (or homogeneous divide).
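To make the division by w more concrete, here is a minimal Python sketch. The helper names `transform` and `divide_by_w` as well as the matrix `M` are made up for illustration; `M` is not a real projection matrix, it merely copies z into w so that the division has a visible effect.

```python
def transform(v, m):
    """Row vector times 4x4 matrix: result[j] = sum over k of v[k] * m[k][j]."""
    return [sum(v[k] * m[k][j] for k in range(4)) for j in range(4)]


def divide_by_w(v):
    """Multiply x, y and z by 1/w and drop the w-component."""
    x, y, z, w = v
    if w == 0:
        raise ValueError("cannot divide a vector with w == 0")
    return [x / w, y / w, z / w]


# Hypothetical matrix: the fourth column copies z into w.
M = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 0.0],
]

position = [2.0, 4.0, 2.0, 1.0]        # w = 1 marks a position
transformed = transform(position, M)   # [2.0, 4.0, 2.0, 2.0]
projected = divide_by_w(transformed)   # [1.0, 2.0, 1.0]
print(transformed, projected)
```

Points that are farther away (larger z) end up with a larger w here and are therefore scaled down more strongly, which is the basic idea behind perspective.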
Additionally, we can see that the result is a sum of products. The first component is the sum of the componentwise product of the matrix's first column and the input vector. Because the w-component is 1 for positions, the translation (fourth row) is simply added to the vector. This is reasonable because a translation can be expressed as the addition of the translation vector. Directions should not be affected by translations. And indeed, due to their zero w-component, the translation part of the matrix has no effect on direction vectors or normals.
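The different behaviour of positions and directions can be checked with a few lines of Python. This is only a sketch; the helpers `transform` and `translation` are illustrative names, not part of any DirectX API.

```python
def transform(v, m):
    """Row vector times 4x4 matrix: result[j] = sum over k of v[k] * m[k][j]."""
    return [sum(v[k] * m[k][j] for k in range(4)) for j in range(4)]


def translation(tx, ty, tz):
    """Translation matrix for row vectors: the offset sits in the fourth row."""
    return [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [tx, ty, tz, 1.0],
    ]


T = translation(5.0, 0.0, 0.0)

point = [1.0, 2.0, 3.0, 1.0]      # w = 1: a position
direction = [1.0, 2.0, 3.0, 0.0]  # w = 0: a direction or normal

moved_point = transform(point, T)          # [6.0, 2.0, 3.0, 1.0]
moved_direction = transform(direction, T)  # [1.0, 2.0, 3.0, 0.0]
print(moved_point, moved_direction)
```

The position is shifted by 5 units along x, while the direction comes out unchanged because its zero w-component multiplies the fourth row away.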

Combining transformations

Transformations that are described by matrices can be combined by multiplying the matrices. The order matters, because matrix multiplication is not commutative. In the following examples, T(v) represents a translation by vector v and RZ(x) represents a rotation about the z-axis by angle x.
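That the order matters is easy to verify numerically. The following sketch uses hypothetical helpers `translation` and `rotation_z` (written for row vectors, with positive angles turning the x-axis towards the y-axis) and applies both products to the origin:

```python
import math


def mat_mul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transform(v, m):
    """Row vector times 4x4 matrix."""
    return [sum(v[k] * m[k][j] for k in range(4)) for j in range(4)]


def translation(tx, ty, tz):
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [tx, ty, tz, 1.0]]


def rotation_z(degrees):
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    return [[c, s, 0.0, 0.0],
            [-s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


T = translation(2.0, 0.0, 0.0)
R = rotation_z(90.0)
origin = [0.0, 0.0, 0.0, 1.0]

# T * R: translate first, then rotate about the global origin.
a = transform(origin, mat_mul(T, R))   # approximately (0, 2, 0)
# R * T: rotate first (the origin stays put), then translate.
b = transform(origin, mat_mul(R, T))   # (2, 0, 0)
print(a[:3], b[:3])
```

Both products contain the same two transforms, yet the origin ends up in two different places.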
Consider the following transformation M:
Sample Transformation
This transform can be interpreted in two ways:

  1. From left to right:
    Combining transforms from left to right leads to transforms being executed with respect to the global coordinate system.
  2. From right to left:
    If we read the matrices in the opposite direction, each transform also transforms the coordinate system. Every subsequent transformation is therefore executed in the resulting local coordinate system.

Of course, the result is the same. The following figure explains the difference in detail:
Interpretation left to right

Every transformation is executed in the global coordinate system. In particular, rotations are performed about the global origin.

Interpretation right to left

Every transform transforms the local coordinate system:

  • After the first translation the local origin is at (0,-2,0)
  • After the rotation the local coordinate system is rotated (x axis points upwards, y axis points right)
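Both readings can also be compared in code. The sketch below uses a hypothetical matrix M = RZ(90°) * T(0,-2,0), which is not necessarily the M from the figure, and illustrative helper names. Reading left to right, the point is transformed step by step in global coordinates. Reading right to left, a local frame is built up and the point is then placed inside it.

```python
import math


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transform(v, m):
    return [sum(v[k] * m[k][j] for k in range(4)) for j in range(4)]


def translation(tx, ty, tz):
    return [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [tx, ty, tz, 1.0]]


def rotation_z(degrees):
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    return [[c, s, 0.0, 0.0], [-s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]


R = rotation_z(90.0)
T = translation(0.0, -2.0, 0.0)
p = [1.0, 0.0, 0.0, 1.0]

# Left to right: rotate about the global origin, then translate globally.
left_to_right = transform(transform(p, R), T)

# Right to left: each transform is put in front of the frame built so far.
frame = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
frame = mat_mul(T, frame)   # local origin moves to (0, -2, 0)
frame = mat_mul(R, frame)   # local axes rotate, local origin stays
x_axis, y_axis, z_axis, local_origin = frame

# Place p inside the local frame: origin + p.x * x_axis + p.y * y_axis + ...
right_to_left = [local_origin[j] + p[0] * x_axis[j] + p[1] * y_axis[j]
                 + p[2] * z_axis[j] for j in range(4)]

print(left_to_right[:3], right_to_left[:3])   # both approximately (0, -1, 0)
```

The rows of the combined matrix are exactly the local axes and the local origin, which is why the right-to-left reading works.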

Grouping objects (Scene Graphs)

Transforms can be used to group separate objects. Consider the following group:

  • Root
    • Triangle
    • Transform: T(-2,0,0)
      • Transform: RZ(45°)
        • Square

So, with no other transforms applied, the group looks like this:

The interpretation from right to left (including the transformation of local coordinate systems) will help us most. Climbing down the tree, whenever we reach a transform, it has to be multiplied onto the left side of the matrix accumulated so far, so that the existing matrix can affect the new local coordinate system.
So, the overall matrix for the triangle would be the identity matrix (there are no transforms applied for this one). The transform for the square will be: RZ(45°) * T(-2,0,0).
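The traversal just described could look roughly like this in Python. The node encoding (tuples tagged "group", "transform" and "geometry") and the function `collect` are invented for this sketch; a real scene graph would use proper classes.

```python
import math


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def translation(tx, ty, tz):
    return [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [tx, ty, tz, 1.0]]


def rotation_z(degrees):
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    return [[c, s, 0.0, 0.0], [-s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]


IDENTITY = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

# Hypothetical encoding of the group shown above.
scene = ("group", [
    ("geometry", "Triangle"),
    ("transform", translation(-2.0, 0.0, 0.0), [
        ("transform", rotation_z(45.0), [
            ("geometry", "Square"),
        ]),
    ]),
])


def collect(node, current, out):
    """Walk down the tree and record the overall matrix of every geometry."""
    kind = node[0]
    if kind == "geometry":
        out[node[1]] = current
    elif kind == "transform":
        _, matrix, children = node
        combined = mat_mul(matrix, current)  # new transform goes on the left
        for child in children:
            collect(child, combined, out)
    else:  # plain group: pass the current matrix through unchanged
        for child in node[1]:
            collect(child, current, out)
    return out


overall = collect(scene, IDENTITY, {})
# overall["Triangle"] is the identity, overall["Square"] is RZ(45°) * T(-2,0,0)
```

Passing a group matrix instead of `IDENTITY` as the starting value would place it to the right of every accumulated product.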
If we want to transform the whole group, the new matrix has to be multiplied onto the right side of the object-specific matrices, because it is supposed to transform the local coordinate systems, too. Of course, the additional matrix can be created by either the left-to-right or right-to-left interpretation.
Suppose the whole group is to be translated 2 units downwards and afterwards rotated by 45° clockwise, read in the right-to-left (local coordinate system) interpretation. The group transform is then RZ(-45°) * T(0,-2,0). Because there are no additional transforms stored for the triangle, this is the overall transform for this geometry, too. The overall transform for the square will be RZ(45°) * T(-2,0,0) * RZ(-45°) * T(0,-2,0). This will result in the following image:

Summing up, in order to calculate the overall transform of an object, the separate transforms have to be multiplied from left to right, beginning with the one most specific to the object.
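As a quick numerical check of the group example, the following sketch multiplies the matrices in the stated order and looks at where the local origins of the triangle and the square end up. The helper names are again illustrative only.

```python
import math


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transform(v, m):
    return [sum(v[k] * m[k][j] for k in range(4)) for j in range(4)]


def translation(tx, ty, tz):
    return [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [tx, ty, tz, 1.0]]


def rotation_z(degrees):
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    return [[c, s, 0.0, 0.0], [-s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]


# Most specific transform first, group transform last.
square_local = mat_mul(rotation_z(45.0), translation(-2.0, 0.0, 0.0))
group = mat_mul(rotation_z(-45.0), translation(0.0, -2.0, 0.0))

triangle_overall = group                       # no transforms of its own
square_overall = mat_mul(square_local, group)  # local * group

origin = [0.0, 0.0, 0.0, 1.0]
triangle_origin = transform(origin, triangle_overall)  # (0, -2, 0)
square_origin = transform(origin, square_overall)      # approx. (-1.41, -0.59, 0)
print(triangle_origin[:3], square_origin[:3])
```

The square's local origin lands at (-√2, √2 - 2, 0): it is first moved 2 units to the left, then rotated clockwise about the global origin together with the rest of the group, and finally moved 2 units down.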
Let’s end with an illustration of the square’s transformation for both interpretations:
Interpretation left to right:

Interpretation right to left:
