Confusion about how inverse bind pose is actually calculated and used?

Question

I am trying to do skeletal animation using Assimp and the inverse bind pose matrix just trips me up. I will give a little example to illustrate my point.

Root
    Bone A
        Bone B
            Bone C

Let's say we have a hierarchy like the one above. To get a vertex in root space into C space, I would have to do (CBA) * v. The inverse bind pose, which is supposed to get a vertex from bone space to root space, should be the inverse of this. Therefore, the inverse bind matrix should be (CBA)^-1, i.e. (A^-1)(B^-1)(C^-1), and the formula is then (A^-1)(B^-1)(C^-1) * v, which makes sense. However, I am not able to reproduce Assimp's mOffsetMatrix (the inverse bind matrix calculated by Assimp) with (CBA)^-1. Surprisingly, (ABC)^-1 does produce the correct result, apart from small differences here and there that I attribute to floating-point error.

score 2 · Accepted Answer · answered May 09 '18 at 15:58

To get a vertex in root space into C space, I would have to do (CBA) * v.

Well, yes. But that's not actually what you want to do in skeletal animation. You have it reversed.

It's the other way around. You assume you have a point in C space and want to compute its position in root/global space, in order to pump it further through the transformation pipeline, which gets increasingly more global with each transformation until your point ends up in eye-space (and further). That's why A doesn't transform from root space into A space, it transforms from A space into root space. So $ABC$ transforms from C space into root space.

Now, your mesh vertices are already in root space. That's where the inverse bind pose comes into play: it transforms a point from root space into C space. So you first transform your point from root space into C (or whatever joint) space with the inverse bind matrix, which depends only on the joint and is constant over the whole animation, and then from that joint space back into root space with the animation-dependent joint transformation.
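Written out in the column-vector convention used in this answer (the rightmost matrix acts first), and with $v_C$ denoting the point's coordinates in C space (a symbol introduced here only for illustration), the two directions are:

$$v_{\text{root}} = A\,B\,C\,v_C, \qquad v_C = (ABC)^{-1}\,v_{\text{root}} = C^{-1}B^{-1}A^{-1}\,v_{\text{root}}$$

This is consistent with the observation in the question that $(ABC)^{-1}$, rather than $(CBA)^{-1}$, reproduces the offset matrix, assuming the local matrices there are child-to-parent transforms.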

So C's transformation matrix for frame $t$, which transforms a vertex from unanimated root space (which is how your mesh is defined) into animated root space (with respect to C's joint angle), would be $A(t)B(t)C(t)\,(A_RB_RC_R)^{-1}$ when applied to column vectors, so that the inverse bind matrix acts on the vertex first. Here $M(t)$ is the local joint matrix at animation time $t$ and $M_R$ is the local joint matrix corresponding to the rest pose, the pose in which the mesh is defined, and thus $(A_RB_RC_R)^{-1}$ is C's inverse bind pose.
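As a rough illustration, here is a minimal pure-Python sketch of that per-joint matrix. It assumes column vectors (so the inverse bind matrix sits rightmost in the product and acts on the vertex first) and rigid joints made of a z-rotation plus a translation. The helper names (`joint`, `chain`, `rigid_inverse`) and all numeric values are hypothetical and not taken from Assimp.

```python
import math

def matmul(a, b):
    """Multiply two 4x4 matrices stored as nested row-major lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transform(m, p):
    """Apply a 4x4 matrix to a 3D point treated as a column vector (w = 1)."""
    v = (p[0], p[1], p[2], 1.0)
    return tuple(sum(m[i][k] * v[k] for k in range(4)) for i in range(3))

def joint(angle_z, tx, ty, tz):
    """Local joint matrix (child space -> parent space): z-rotation, then translation."""
    c, s = math.cos(angle_z), math.sin(angle_z)
    return [[c, -s, 0.0, tx],
            [s, c, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]

def rigid_inverse(m):
    """Invert a rotation + translation matrix using R^T and -R^T t."""
    r_t = [[m[j][i] for j in range(3)] for i in range(3)]
    t = [m[i][3] for i in range(3)]
    inv_t = [-sum(r_t[i][k] * t[k] for k in range(3)) for i in range(3)]
    return [r_t[i] + [inv_t[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

def chain(*local_matrices):
    """Concatenate local matrices parent-first: chain(A, B, C) == A * B * C."""
    result = joint(0.0, 0.0, 0.0, 0.0)  # identity
    for m in local_matrices:
        result = matmul(result, m)
    return result

# Rest pose: each joint sits one unit above its parent (made-up values).
A_R = joint(0.0, 0.0, 1.0, 0.0)
B_R = joint(0.0, 0.0, 1.0, 0.0)
C_R = joint(0.0, 0.0, 1.0, 0.0)

bind_C = chain(A_R, B_R, C_R)           # C space -> root space in the rest pose
inverse_bind_C = rigid_inverse(bind_C)  # root space -> C space, constant over time

# Animated pose at some time t: only B is bent by 90 degrees.
A_t, B_t, C_t = A_R, joint(math.pi / 2, 0.0, 1.0, 0.0), C_R

# Per-joint skinning matrix: inverse bind first, then the animated chain.
skin_C = matmul(chain(A_t, B_t, C_t), inverse_bind_C)

vertex = (0.0, 3.5, 0.0)                    # root space, 0.5 above C's rest origin
local = transform(inverse_bind_C, vertex)   # the same point expressed in C space
animated = transform(skin_C, vertex)        # approximately (-1.5, 2.0, 0.0)
```

When the animated pose equals the rest pose, the product collapses to the identity and the vertex stays where it was defined, which is a quick sanity check for the multiplication order.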

Christian Rau


Are you sure that the inverse bind pose (or mOffsetMatrix) transforms a point from root space to bone space? I have read conflicting opinions on that matter: https://github.com/assimp/assimp/pull/1803 – Manh Nguyen May 09 '18 at 19:38
That's what it's *supposed* to do. But I don't know how Assimp handles it specifically. Maybe there's some confusion over the matrix vs its inverse? – Christian Rau May 09 '18 at 19:58
Thanks for the prompt reply. So basically, I need to transform a vertex from mesh space to world space, then use inverseBindTransform to get it from world space to bone space, where I would multiply it with the desired animation pose, and then transform that back into world space again, right? – Manh Nguyen May 09 '18 at 20:04
Not exactly. Your vertex is in mesh space. You transform it from mesh space (let's say that's also root space) into local joint/bone space with the inverse bind matrix, then from local joint space back into root/mesh space, but this time with the actual animation matrix. Usually that whole thing isn't really applied in order like this but instead builds a single matrix, the one transforming a vertex for *one* joint from root space into animated root space. And those are the joint matrices you combine with usual blend skinning (or other method) to get the matrix you transform your vertex with. – Christian Rau May 09 '18 at 20:16
And this animated vertex, now still in mesh/root space, but properly animated, you then transform with the normal model-to-world and world-to-camera or whatever global transformation matrices. – Christian Rau May 09 '18 at 20:18
I don't think I can directly transform it from mesh space to bone space. The way I am calculating the inverse bind, I concatenate every transformation matrix starting from the root node, so it's actually in world space. So I need to move the vertex from mesh space to world space before I can multiply it with the inverse bind transform to bring it to bone space, right? The way I am doing it is (CBA)^-1, but that is wrong somehow – Manh Nguyen May 10 '18 at 03:10
Well, then from world space. The point is that you use the same matrices for the inverse bind matrix that you use for the actual animation, just inverse and for the animation frame that the mesh is defined in. But it's also really hard to argue and figure out what actual multiplication order you should use in a comment thread. I gave the general idea how it works above, the rest are technicalities and maybe a little more reading of introductory material how skeletal animation works in general. You just always need to be aware in which space your data is defined and in which space you want it. – Christian Rau May 10 '18 at 13:08
