Chapter 15 - Animations
Until now we have only loaded static 3D models, but in this chapter we will learn how to animate them. When thinking about animations, a first approach is to create a different mesh for each model position, load them into the GPU and draw them sequentially to create the illusion of movement. Although this approach is perfect for some games, it is not very efficient in terms of memory consumption. This is where skeletal animation comes into play. We will learn how to load these models using assimp.
You can find the complete source code for this chapter here.
Anti-aliasing support
In this chapter we will also add support for anti-aliasing. Up to this moment you may have seen saw-like edges in the models. In order to remove those artifacts we will apply anti-aliasing, which basically uses the values of several samples to construct the final value for each pixel. In our case, we will use four sampled values. We need to set this up as a window hint prior to window creation (and add a new window option to control it):
public class Window {
    ...
    public Window(String title, WindowOptions opts, Callable<Void> resizeFunc) {
        ...
        if (opts.antiAliasing) {
            glfwWindowHint(GLFW_SAMPLES, 4);
        }
        glfwWindowHint(GLFW_CONTEXT_VERSION_MAJOR, 3);
        glfwWindowHint(GLFW_CONTEXT_VERSION_MINOR, 2);
        ...
    }
    ...
    public static class WindowOptions {
        public boolean antiAliasing;
        ...
    }
}

In the Render class we need to enable multi-sampling (in addition to that, we remove face culling to properly render the sample model):
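A minimal sketch of that change could look like this (it assumes the Render class from previous chapters, which sets up global OpenGL state in its constructor):

public class Render {
    ...
    public Render(Window window) {
        GL.createCapabilities();
        glEnable(GL_DEPTH_TEST);
        // Enable multi-sampling so the GLFW_SAMPLES window hint takes effect
        glEnable(GL_MULTISAMPLE);
        // Note: face culling is no longer enabled here so the sample model renders properly
        ...
    }
    ...
}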
Introduction
In skeletal animation the way a model animates is defined by its underlying skeleton. A skeleton is defined by a hierarchy of special elements called bones. These bones are defined by their position and rotation. We have also said that it is a hierarchy, which means that the final position of each bone is affected by the position of its parents. For instance, think of a wrist: the position of a wrist is modified if a character moves the elbow and also if it moves the shoulder.
Bones do not need to represent a physical bone or articulation: they are artifacts that allow the creatives to model an animation. In addition to bones we still have vertices, the points that define the triangles that compose a 3D model. But in skeletal animation, vertices are drawn based on the position of the bones they relate to.
While writing this chapter I consulted many different sources, but I found two that provide a very good explanation of how to create an animated model. These sources can be consulted at:
If you load a model which contains animations with the current code, you will get what is called the binding pose. You can try that (with the code from the previous chapter) and you will be able to see the 3D model perfectly. The binding pose defines the positions, normals, and texture coordinates of the model without being affected by the animation at all. An animated model defines, in essence, the following additional information:
A tree-like structure, composed of bones, which defines a hierarchy where we can compose transformations.
Each mesh, besides containing information about vertex positions, normals, etc., will include information about which bones each vertex relates to (by using a bone index) and how much it is affected by them (the effect is modulated by using a weight factor).
A set of animation key frames which define the specific transformations that should be applied to each bone and, by extension, will modify the associated vertices. A model can define several animations and each of them may be composed of several animation key frames. When animating, we iterate over those key frames (which define a duration) and we can even interpolate between them. In essence, for a specific instant of time we are applying to each vertex the transformations associated to the related bones.
Let’s review first the structures handled by assimp that contain animation information. We will start with the bones and weights information. For each AIMesh, we can access the vertex positions, texture coordinates and indices. Meshes also store a list of bones. Each bone is defined by the following attributes:
A name.
An offset matrix: this will be used later to compute the final transformations that should be used for each bone.
Bones also point to a list of weights. Each weight is defined by the following attributes:
A weight factor, that is, the number that will be used to modulate the influence of the bone’s transformation over the associated vertex.
A vertex identifier, that is, the vertex associated to the current bone.
The following picture shows the relationships between all these elements.

Therefore, each vertex, besides containing its position, normals and texture coordinates, will now have a set of indices (typically four values) of the bones that affect it (jointIndices) and a set of weights that modulate that effect. Each vertex will be modified according to the transformation matrices associated to each joint in order to calculate its final position. Therefore, we will need to augment the VAO associated to each mesh to hold that information, as shown in the next figure.

The assimp scene object defines a hierarchy of nodes. Each Node is defined by a name and a list of children nodes. Animations use these nodes to define the transformations that should be applied. This hierarchy is indeed the bones’ hierarchy. Every bone is a node and has a parent, except the root node, and possibly a set of children. There are special nodes that are not bones; they are used to group transformations and should be handled when calculating the transformations. Another issue is that this node hierarchy is defined for the whole model: we do not have separate hierarchies for each mesh.
A scene also defines a set of animations. A single model can have more than one animation to model how a character walks, runs, etc. Each of these animations defines different transformations. An animation has the following attributes:
A name.
A duration, that is, the length in time of the animation. The name may seem confusing, since in assimp an animation is just the list of transformations that should be applied to each node for each different frame.
A list of animation channels. An animation channel contains, for a specific instant in time, the translation, rotation and scaling information that should be applied to each node. The class that models the data contained in the animation channels is AINodeAnim. Animation channels can be thought of as the key frames.
The following figure shows the relationships between all the elements described above.

For a specific instant of time, that is, for a frame, the transformation to be applied to a bone is the transformation defined in the animation channel for that instant, multiplied by the transformations of all the parent nodes up to the root node. Hence, we need to extract the information stored in the scene; the process is as follows:
Construct the node hierarchy.
For each animation, iterate over each animation channel (for each animation node) and construct the transformation matrices for each of the bones for all the potential animation frames. Those transformation matrices are a combination of the transformation matrix of the node associated to the bone and the bone transformation matrices.
We start at the root node, and for each frame, build the transformation matrix for that node, which is the transformation matrix of the node multiplied by the composition of the translation, rotation and scale matrix of that specific frame for that node.
We then get the bones associated to that node and complement that transformation by multiplying the offset matrices of the bones. The result will be a transformation matrix associated with the related bones for that specific frame, which will be used in the shaders.
After that, we iterate over the children nodes, passing the transformation matrix of the parent node to also be used in combination with the children node transformations.
Implementation
Let's start by analyzing the changes in the ModelLoader class:
We need an extra argument (named animation) in the loadModel method to indicate whether we are loading a model with animations or not. If so, we cannot use the aiProcess_PreTransformVertices flag. This flag performs some transformations over the loaded data so the model is placed at the origin and the coordinates are corrected to match the OpenGL coordinate system. We cannot use this flag for animated models because it removes the animation information.
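A possible way to select the post-processing flags based on that argument could look like this (the exact set of flags combined with it follows the previous chapters and is just one reasonable choice):

public static Model loadModel(String modelId, String modelPath, TextureCache textureCache, boolean animation) {
    return loadModel(modelId, modelPath, textureCache, aiProcess_GenSmoothNormals | aiProcess_JoinIdenticalVertices |
            aiProcess_Triangulate | aiProcess_FixInfacingNormals | aiProcess_CalcTangentSpace | aiProcess_LimitBoneWeights |
            // Pre-transforming vertices would bake (and therefore destroy) the animation data
            (animation ? 0 : aiProcess_PreTransformVertices));
}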
While processing the meshes, we will also process the associated bones and weights for each vertex, storing the list of bones so we can later on build the required transformations:
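Inside the mesh-processing loop of loadModel this could look roughly as follows (the exact split between loadModel and processMesh, and the processMesh signature, are assumptions of this sketch):

List<Bone> boneList = new ArrayList<>();
...
PointerBuffer aiMeshes = aiScene.mMeshes();
for (int i = 0; i < numMeshes; i++) {
    AIMesh aiMesh = AIMesh.create(aiMeshes.get(i));
    // Extract per-vertex bone indices and weights, registering this mesh's bones in boneList
    AnimMeshData animMeshData = processBones(aiMesh, boneList);
    // The mesh now also receives the bone weights and indices arrays
    Mesh mesh = processMesh(aiMesh, animMeshData);
    ...
}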
The new method processBones is defined like this:
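One possible implementation is sketched below (MAX_WEIGHTS is a constant set to four, and Bone, VertexWeight and AnimMeshData are the records described right after):

private static AnimMeshData processBones(AIMesh aiMesh, List<Bone> boneList) {
    List<Integer> boneIds = new ArrayList<>();
    List<Float> weights = new ArrayList<>();

    // Group the weights defined by each bone by the vertex they affect
    Map<Integer, List<VertexWeight>> weightSet = new HashMap<>();
    int numBones = aiMesh.mNumBones();
    PointerBuffer aiBones = aiMesh.mBones();
    for (int i = 0; i < numBones; i++) {
        AIBone aiBone = AIBone.create(aiBones.get(i));
        int id = boneList.size();
        Bone bone = new Bone(id, aiBone.mName().dataString(), toMatrix(aiBone.mOffsetMatrix()));
        boneList.add(bone);
        int numWeights = aiBone.mNumWeights();
        AIVertexWeight.Buffer aiWeights = aiBone.mWeights();
        for (int j = 0; j < numWeights; j++) {
            AIVertexWeight aiWeight = aiWeights.get(j);
            VertexWeight vw = new VertexWeight(bone.boneId(), aiWeight.mVertexId(), aiWeight.mWeight());
            weightSet.computeIfAbsent(vw.vertexId(), v -> new ArrayList<>()).add(vw);
        }
    }

    // For each vertex store up to MAX_WEIGHTS bone ids and weights, padding with zeroes
    int numVertices = aiMesh.mNumVertices();
    for (int i = 0; i < numVertices; i++) {
        List<VertexWeight> vertexWeightList = weightSet.get(i);
        int size = vertexWeightList != null ? vertexWeightList.size() : 0;
        for (int j = 0; j < MAX_WEIGHTS; j++) {
            if (j < size) {
                VertexWeight vw = vertexWeightList.get(j);
                weights.add(vw.weight());
                boneIds.add(vw.boneId());
            } else {
                weights.add(0.0f);
                boneIds.add(0);
            }
        }
    }

    return new AnimMeshData(Utils.listFloatToArray(weights), Utils.listIntToArray(boneIds));
}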
This method traverses the bone definitions for a specific mesh, getting their weights and filling up three lists:
boneList: It contains a list of bones, with their offset matrices. It will be used later on to calculate the final bone transformations. A new class named Bone has been created to hold that information. This list will contain the bones for all the meshes.
boneIds: It contains just the identifiers of the bones for each vertex of the Mesh. Bones are identified by their position when rendering. This list only contains the bones for a specific Mesh.
weights: It contains the weights for each vertex of the Mesh to be applied to the associated bones.
The information retrieved in this method is encapsulated in the AnimMeshData record (defined inside the ModelLoader class). The new Bone and VertexWeight classes are also records. They are defined like this:
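A minimal sketch of those records (field names are assumptions consistent with the rest of this chapter):

public record AnimMeshData(float[] weights, int[] boneIds) {
}

public record Bone(int boneId, String boneName, Matrix4f offsetMatrix) {
}

public record VertexWeight(int boneId, int vertexId, float weight) {
}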
We have also created two new methods in the Utils class to transform a List of floats or ints into an array:
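They could look like this (the method names listFloatToArray and listIntToArray are assumptions of this sketch):

public class Utils {
    ...
    public static float[] listFloatToArray(List<Float> list) {
        int size = list != null ? list.size() : 0;
        float[] floatArr = new float[size];
        for (int i = 0; i < size; i++) {
            floatArr[i] = list.get(i);
        }
        return floatArr;
    }

    public static int[] listIntToArray(List<Integer> list) {
        return list.stream().mapToInt((Integer v) -> v).toArray();
    }
    ...
}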
Going back to the loadModel method, when we have processed the meshes and the materials we will process the animation data (that is, the different animation key frames associated to each animation and their transformations). All that information is also stored in the Model class:
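A sketch of that final part of loadModel could be (it assumes the materialList variable from previous chapters and the Model constructor described below):

List<Model.Animation> animations = new ArrayList<>();
int numAnimations = aiScene.mNumAnimations();
if (numAnimations > 0) {
    // Build the node hierarchy and the inverse of the root node transformation
    Node rootNode = buildNodesTree(aiScene.mRootNode(), null);
    Matrix4f globalInverseTransformation = toMatrix(aiScene.mRootNode().mTransformation()).invert();
    animations = processAnimations(aiScene, boneList, rootNode, globalInverseTransformation);
}

aiReleaseImport(aiScene);

return new Model(modelId, materialList, animations);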
The buildNodesTree method is quite simple. It just traverses the node hierarchy starting from the root node, constructing a tree of nodes:
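One possible implementation (it relies on the Node class and the toMatrix method shown later in this chapter):

private static Node buildNodesTree(AINode aiNode, Node parentNode) {
    String nodeName = aiNode.mName().dataString();
    Node node = new Node(nodeName, parentNode, toMatrix(aiNode.mTransformation()));

    int numChildren = aiNode.mNumChildren();
    PointerBuffer aiChildren = aiNode.mChildren();
    for (int i = 0; i < numChildren; i++) {
        AINode aiChildNode = AINode.create(aiChildren.get(i));
        Node childNode = buildNodesTree(aiChildNode, node);
        node.addChild(childNode);
    }
    return node;
}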
The toMatrix method just transforms an assimp matrix to a JOML one:
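It could be implemented like this (assimp stores matrices in row-major order while JOML uses column-major order, so each element is placed accordingly):

private static Matrix4f toMatrix(AIMatrix4x4 aiMatrix4x4) {
    Matrix4f result = new Matrix4f();
    result.m00(aiMatrix4x4.a1());
    result.m10(aiMatrix4x4.a2());
    result.m20(aiMatrix4x4.a3());
    result.m30(aiMatrix4x4.a4());
    result.m01(aiMatrix4x4.b1());
    result.m11(aiMatrix4x4.b2());
    result.m21(aiMatrix4x4.b3());
    result.m31(aiMatrix4x4.b4());
    result.m02(aiMatrix4x4.c1());
    result.m12(aiMatrix4x4.c2());
    result.m22(aiMatrix4x4.c3());
    result.m32(aiMatrix4x4.c4());
    result.m03(aiMatrix4x4.d1());
    result.m13(aiMatrix4x4.d2());
    result.m23(aiMatrix4x4.d3());
    result.m33(aiMatrix4x4.d4());
    return result;
}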
The processAnimations method is defined like this:
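One possible implementation could be the following (MAX_BONES is an assumed constant that sets the maximum number of bones supported, matching the size of the uniform array used later in the shader):

private static List<Model.Animation> processAnimations(AIScene aiScene, List<Bone> boneList,
                                                       Node rootNode, Matrix4f globalInverseTransformation) {
    List<Model.Animation> animations = new ArrayList<>();

    int numAnimations = aiScene.mNumAnimations();
    PointerBuffer aiAnimations = aiScene.mAnimations();
    for (int i = 0; i < numAnimations; i++) {
        AIAnimation aiAnimation = AIAnimation.create(aiAnimations.get(i));
        int maxFrames = calcAnimationMaxFrames(aiAnimation);

        List<Model.AnimatedFrame> frames = new ArrayList<>();
        Model.Animation animation = new Model.Animation(aiAnimation.mName().dataString(),
                aiAnimation.mDuration(), frames);
        animations.add(animation);

        for (int j = 0; j < maxFrames; j++) {
            // One transformation matrix per bone, defaulting to the identity
            Matrix4f[] boneMatrices = new Matrix4f[MAX_BONES];
            Arrays.fill(boneMatrices, new Matrix4f());
            Model.AnimatedFrame animatedFrame = new Model.AnimatedFrame(boneMatrices);
            // Start the recursion at the root node with an identity parent transformation
            buildFrameMatrices(aiAnimation, boneList, animatedFrame, j, rootNode,
                    new Matrix4f(), globalInverseTransformation);
            frames.add(animatedFrame);
        }
    }
    return animations;
}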
This method returns a List of Model.Animation instances. Remember that a model can have more than one animation, so they are stored by their index. For each of these animations, we construct a list of animation frames (Model.AnimatedFrame instances), which are essentially a list of the transformation matrices to be applied to each of the bones that compose the model. For each animation, we calculate the maximum number of frames by calling the method calcAnimationMaxFrames, which is defined like this:
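It could be implemented like this:

private static int calcAnimationMaxFrames(AIAnimation aiAnimation) {
    int maxFrames = 0;
    int numNodeAnims = aiAnimation.mNumChannels();
    PointerBuffer aiChannels = aiAnimation.mChannels();
    for (int i = 0; i < numNodeAnims; i++) {
        AINodeAnim aiNodeAnim = AINodeAnim.create(aiChannels.get(i));
        // A channel may have a different number of position, rotation and scaling keys; take the maximum
        int numFrames = Math.max(Math.max(aiNodeAnim.mNumPositionKeys(), aiNodeAnim.mNumScalingKeys()),
                aiNodeAnim.mNumRotationKeys());
        maxFrames = Math.max(maxFrames, numFrames);
    }
    return maxFrames;
}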
Before continuing to review the changes in the ModelLoader class, let's review the changes in the Model class to hold animation information:
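A sketch of those changes (the rest of the Model class stays as in previous chapters):

public class Model {
    ...
    private List<Animation> animationList;

    public Model(String id, List<Material> materialList, List<Animation> animationList) {
        ...
        this.animationList = animationList;
    }

    public List<Animation> getAnimationList() {
        return animationList;
    }
    ...
    public record AnimatedFrame(Matrix4f[] boneMatrices) {
    }

    public record Animation(String name, double duration, List<AnimatedFrame> frames) {
    }
}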
As you can see, we store the list of animations associated to the model, each animation defined by a name, a duration and a list of animation frames, which in essence just stores the bone transformation matrices to be applied for each bone.
Back to the ModelLoader class, each AINodeAnim instance defines some transformations to be applied to a node in the model for a specific frame. These transformations are defined in the form of translations, rotations and scaling values. The trick here is that, for a specific node, translation values can stop at a specific frame while rotation and scaling values continue for the next frames. In this case, we will have fewer translation values than rotation or scaling ones. Therefore, a good approximation to calculate the maximum number of frames is to use the maximum of those values. The problem gets more complex because this is defined per node: a node may define transformations just for the first frames and not apply any more modifications for the rest. In that case, we should keep using the last defined values. Therefore, we take the maximum number of keys over all the channels of the animation.
Going back to the processAnimations method, with that information we are ready to iterate over the different frames and build the transformation matrices for the bones by calling the buildFrameMatrices method. For each frame, we start with the root node and apply the transformations recursively from the top to the bottom of the node hierarchy. The buildFrameMatrices method is defined like this:
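It could be implemented like this (buildNodeTransformationMatrix is a helper, sketched a bit later, that builds the translation-rotation-scale matrix for a node at a given frame):

private static void buildFrameMatrices(AIAnimation aiAnimation, List<Bone> boneList,
                                       Model.AnimatedFrame animatedFrame, int frame, Node node,
                                       Matrix4f parentTransformation, Matrix4f globalInverseTransform) {
    String nodeName = node.getName();
    AINodeAnim aiNodeAnim = findAIAnimNode(aiAnimation, nodeName);
    Matrix4f nodeTransform = node.getNodeTransformation();
    if (aiNodeAnim != null) {
        // The node is animated: use the transformation defined for this frame
        nodeTransform = buildNodeTransformationMatrix(aiNodeAnim, frame);
    }
    Matrix4f nodeGlobalTransform = new Matrix4f(parentTransformation).mul(nodeTransform);

    // Update the matrices of the bones attached to this node
    List<Bone> affectedBones = boneList.stream().filter(b -> b.boneName().equals(nodeName)).toList();
    for (Bone bone : affectedBones) {
        Matrix4f boneTransform = new Matrix4f(globalInverseTransform).mul(nodeGlobalTransform)
                .mul(bone.offsetMatrix());
        animatedFrame.boneMatrices()[bone.boneId()] = boneTransform;
    }

    // Recurse over the children, passing this node's global transformation as their parent one
    for (Node childNode : node.getChildren()) {
        buildFrameMatrices(aiAnimation, boneList, animatedFrame, frame, childNode, nodeGlobalTransform,
                globalInverseTransform);
    }
}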
We get the transformation associated to the node. Then we check if this node has an animation node associated to it. If so, we need to get the proper translation, rotation and scaling transformations that apply to the frame we are handling. With that information, we get the bones associated to that node and update the transformation matrix for each of those bones for that specific frame by multiplying:
The model inverse global transformation matrix (the inverse of the root node transformation matrix).
The transformation matrix for the node.
The bone offset matrix.
After that, we iterate over the children nodes, using the node transformation matrix as the parent matrix for those child nodes.
The AINodeAnim instance defines a set of keys that contain translation, rotation and scaling information. These keys refer to specific instants of time. We assume that the information is ordered in time, and construct a list of matrices that contain the transformation to be applied for each frame. As said before, some of those transformations may "stop" at a specific frame, so we should keep using the last defined values for the remaining frames.
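A sketch of that logic, in a helper named here buildNodeTransformationMatrix (the name is an assumption of this sketch), could be:

private static Matrix4f buildNodeTransformationMatrix(AINodeAnim aiNodeAnim, int frame) {
    AIVectorKey.Buffer positionKeys = aiNodeAnim.mPositionKeys();
    AIVectorKey.Buffer scalingKeys = aiNodeAnim.mScalingKeys();
    AIQuatKey.Buffer rotationKeys = aiNodeAnim.mRotationKeys();

    Matrix4f nodeTransform = new Matrix4f();

    // If a channel has fewer keys than the requested frame, reuse its last key
    int numPositions = aiNodeAnim.mNumPositionKeys();
    if (numPositions > 0) {
        AIVector3D vec = positionKeys.get(Math.min(numPositions - 1, frame)).mValue();
        nodeTransform.translate(vec.x(), vec.y(), vec.z());
    }
    int numRotations = aiNodeAnim.mNumRotationKeys();
    if (numRotations > 0) {
        AIQuaternion aiQuat = rotationKeys.get(Math.min(numRotations - 1, frame)).mValue();
        Quaternionf quat = new Quaternionf(aiQuat.x(), aiQuat.y(), aiQuat.z(), aiQuat.w());
        nodeTransform.rotate(quat);
    }
    int numScalingKeys = aiNodeAnim.mNumScalingKeys();
    if (numScalingKeys > 0) {
        AIVector3D vec = scalingKeys.get(Math.min(numScalingKeys - 1, frame)).mValue();
        nodeTransform.scale(vec.x(), vec.y(), vec.z());
    }

    return nodeTransform;
}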
The findAIAnimNode method is defined like this:
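A possible implementation just searches the animation channels for the one whose node name matches:

private static AINodeAnim findAIAnimNode(AIAnimation aiAnimation, String nodeName) {
    AINodeAnim result = null;
    int numAnimNodes = aiAnimation.mNumChannels();
    PointerBuffer aiChannels = aiAnimation.mChannels();
    for (int i = 0; i < numAnimNodes; i++) {
        AINodeAnim aiNodeAnim = AINodeAnim.create(aiChannels.get(i));
        if (nodeName.equals(aiNodeAnim.mNodeName().dataString())) {
            result = aiNodeAnim;
            break;
        }
    }
    return result;
}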
The Mesh class needs to be updated to allocate the new VBOs for bone indices and bone weights. You will see that we use a maximum of four weights (and associated bone indices) per vertex.
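The new VBOs could be set up like this (the sketch assumes that positions, normals, tangents, bitangents and texture coordinates already occupy attribute locations 0 to 4, and that the constructor receives the new boneWeights and boneIndices arrays):

// Inside the Mesh constructor, after the existing VBOs have been created
// Bone weights VBO (four floats per vertex)
vboId = glGenBuffers();
vboIdList.add(vboId);
FloatBuffer weightsBuffer = MemoryUtil.memCallocFloat(boneWeights.length);
weightsBuffer.put(boneWeights).flip();
glBindBuffer(GL_ARRAY_BUFFER, vboId);
glBufferData(GL_ARRAY_BUFFER, weightsBuffer, GL_STATIC_DRAW);
glEnableVertexAttribArray(5);
glVertexAttribPointer(5, 4, GL_FLOAT, false, 0, 0);
MemoryUtil.memFree(weightsBuffer);

// Bone indices VBO (four ints per vertex); an integer pointer is used since the shader expects an ivec4
vboId = glGenBuffers();
vboIdList.add(vboId);
IntBuffer boneIndicesBuffer = MemoryUtil.memCallocInt(boneIndices.length);
boneIndicesBuffer.put(boneIndices).flip();
glBindBuffer(GL_ARRAY_BUFFER, vboId);
glBufferData(GL_ARRAY_BUFFER, boneIndicesBuffer, GL_STATIC_DRAW);
glEnableVertexAttribArray(6);
glVertexAttribIPointer(6, 4, GL_INT, 0, 0);
MemoryUtil.memFree(boneIndicesBuffer);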
The Node class just stores the data associated to an AINode and has specific methods to manage its children:
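A sketch of that class:

public class Node {

    private final List<Node> children;
    private final String name;
    private final Matrix4f nodeTransformation;
    private final Node parent;

    public Node(String name, Node parent, Matrix4f nodeTransformation) {
        this.name = name;
        this.parent = parent;
        this.nodeTransformation = nodeTransformation;
        this.children = new ArrayList<>();
    }

    public void addChild(Node node) {
        this.children.add(node);
    }

    public List<Node> getChildren() {
        return children;
    }

    public String getName() {
        return name;
    }

    public Matrix4f getNodeTransformation() {
        return nodeTransformation;
    }

    public Node getParent() {
        return parent;
    }
}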
Now we can see how we render animated models and how they can coexist with static ones. Let's start with the SceneRender class. In this class we just need to set up a new uniform to pass the bone matrices (those assigned to the current animation frame) so they can be used in the shader. Besides that, rendering static and animated entities has no additional impact on this class.
For static models, we will pass an array of matrices set to null. We also need to modify the UniformsMap to add a new method to set up the values for an array of matrices:
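A sketch of that new method (it assumes the uniforms map of locations used by UniformsMap in previous chapters, and tolerates a null array):

public void setUniform(String uniformName, Matrix4f[] matrices) {
    try (MemoryStack stack = MemoryStack.stackPush()) {
        int length = matrices != null ? matrices.length : 0;
        FloatBuffer fb = stack.mallocFloat(16 * length);
        for (int i = 0; i < length; i++) {
            // Store each matrix in column-major order at its slot in the buffer
            matrices[i].get(16 * i, fb);
        }
        glUniformMatrix4fv(uniforms.get(uniformName), false, fb);
    }
}

In SceneRender, the new uniform would be created in createUniforms and filled per entity, roughly like this (the uniform name bonesMatrices is an assumption of this sketch):

AnimationData animationData = entity.getAnimationData();
Matrix4f[] bonesMatrices = animationData == null ? null : animationData.getCurrentFrame().boneMatrices();
uniformsMap.setUniform("bonesMatrices", bonesMatrices);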
We have also created a new class named AnimationData to control the current animation assigned to an Entity:
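A minimal sketch of that class could be:

public class AnimationData {

    private Model.Animation currentAnimation;
    private int currentFrameIdx;

    public AnimationData(Model.Animation currentAnimation) {
        currentFrameIdx = 0;
        this.currentAnimation = currentAnimation;
    }

    public Model.Animation getCurrentAnimation() {
        return currentAnimation;
    }

    public Model.AnimatedFrame getCurrentFrame() {
        return currentAnimation.frames().get(currentFrameIdx);
    }

    public void nextFrame() {
        // Wrap around to the first frame when the animation ends
        int nextFrame = currentFrameIdx + 1;
        currentFrameIdx = nextFrame > currentAnimation.frames().size() - 1 ? 0 : nextFrame;
    }

    public void setCurrentAnimation(Model.Animation currentAnimation) {
        currentFrameIdx = 0;
        this.currentAnimation = currentAnimation;
    }
}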
And of course, we need to modify the Entity class to hold a reference to the AnimationData instance:
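The change could be just a new attribute with its getter and setter:

public class Entity {
    ...
    private AnimationData animationData;
    ...
    public AnimationData getAnimationData() {
        return animationData;
    }

    public void setAnimationData(AnimationData animationData) {
        this.animationData = animationData;
    }
    ...
}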
We need to modify the scene vertex shader (scene.vert) to bring the animation data into play. We start by defining some constants and the new input attributes for bone weights and indices (we are using four elements per vertex, so we use vec4 and ivec4). We also pass the bone matrices associated with the current animation as a uniform.
In the main function we iterate over the bone weights and modify the position and normals using the matrices designated by the associated bone indices, modulated by the associated weights. You can think of it as each bone contributing to the modification of the position (and normals), modulated by its weight. For static models the weights are zero, so we keep the original position and normal values.
The following figure depicts the process.

In the Main class we need to load animated models and activate anti-aliasing. We will also advance the animation frame on each update:
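A sketch of the relevant parts (the model path, entity names and the Engine / IAppLogic structure from previous chapters are assumptions here):

public class Main implements IAppLogic {

    private AnimationData animationData;

    public static void main(String[] args) {
        Main main = new Main();
        Window.WindowOptions opts = new Window.WindowOptions();
        opts.antiAliasing = true;
        Engine gameEng = new Engine("chapter-15", opts, main);
        gameEng.start();
    }

    @Override
    public void init(Window window, Scene scene, Render render) {
        ...
        // Hypothetical path; use the animated model shipped with the chapter's sources
        Model animModel = ModelLoader.loadModel("animModel", "resources/models/bob/boblamp.md5mesh",
                scene.getTextureCache(), true);
        scene.addModel(animModel);
        Entity animEntity = new Entity("animEntity", animModel.getId());
        animationData = new AnimationData(animModel.getAnimationList().get(0));
        animEntity.setAnimationData(animationData);
        scene.addEntity(animEntity);
        ...
    }

    @Override
    public void update(Window window, Scene scene, long diffTimeMillis) {
        // Advance to the next key frame of the current animation
        animationData.nextFrame();
    }
}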
Finally, we also need to modify the SkyBox class since the loadModel method from the ModelLoader class has changed:
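The skybox is a static model, so it just passes false for the new animation parameter (the constructor shape follows previous chapters):

public class SkyBox {
    ...
    public SkyBox(String skyBoxModelPath, TextureCache textureCache) {
        skyBoxModel = ModelLoader.loadModel("skybox-model", skyBoxModelPath, textureCache, false);
        ...
    }
    ...
}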
You will be able to see something like this:
