Setting Bones From Matrix

kklouzal
Posts: 343
Joined: Sun Mar 28, 2010 8:14 pm
Location: USA - Arizona

Re: Setting Bones From Matrix

Post by kklouzal »

Thank you, devsh; that article explained things very well. OZZ Animation provides the bind pose, but unfortunately its matrices are in a different format from the ones it gives when doing an animation. I raised an issue on their GitHub page to figure out how to combine the two.

In the meantime I figured I would drop OZZ and go 100% Irrlicht:

Code:

class irrSkinningCallback : public irr::video::IShaderConstantSetCallBack
{
public:
    irr::u32 MaxBones = 70;
    irr::scene::ISkinnedMesh* Mesh;
    irr::f32* Uniforms = new irr::f32[MaxBones * 16];
    irr::f32* Uniforms_Scratch = Uniforms;
    irr::core::matrix4 BoneTranslation;
    bool Update = true;
 
    irrSkinningCallback() : BoneTranslation(irr::core::matrix4::EM4CONST_NOTHING) {}
    ~irrSkinningCallback()
    {
        delete[] Uniforms;
    }
 
    void SetupNode(irr::video::IVideoDriver* driver, irr::scene::IAnimatedMeshSceneNode* Node)
    {
        Mesh = (irr::scene::ISkinnedMesh*)Node->getMesh();
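        // Clamp to the number of matrices allocated for Uniforms so the copy loop in OnSetConstants cannot overrun the buffer.
        MaxBones = irr::core::min_(MaxBones, Mesh->getAllJoints().size());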
        Mesh->setHardwareMappingHint(irr::scene::EHM_STATIC, irr::scene::EBT_VERTEX_AND_INDEX);
        Mesh->setHardwareSkinning(true);
        irr::video::IGPUProgrammingServices* gpu = driver->getGPUProgrammingServices();
 
        irr::io::path VertPath = "skinning.vert";
        irr::io::path FragPath = "";
        irr::s32 mtlSkinningShader = gpu->addHighLevelShaderMaterialFromFiles(
            VertPath, "main", irr::video::EVST_VS_4_1,
            FragPath, "main", irr::video::EPST_PS_4_1,
            this, irr::video::EMT_SOLID, 0, irr::video::EGSL_DEFAULT);
 
        Node->setMaterialType((irr::video::E_MATERIAL_TYPE)mtlSkinningShader);
    }
 
    void OnSetConstants(irr::video::IMaterialRendererServices* services, irr::s32 userData)
    {
        if (Update) {
            Update = false;
        }
        else {
            Update = true;
            return;
        }
        Uniforms_Scratch = Uniforms;
 
        for (irr::u32 Bone = 0; Bone < MaxBones; ++Bone)
        {
            BoneTranslation.setbyproduct(
                Mesh->getAllJoints()[Bone]->GlobalAnimatedMatrix,
                Mesh->getAllJoints()[Bone]->GlobalInversedMatrix);
 
            for (irr::u8 Float = 0; Float < 16; ++Float) {
                *Uniforms_Scratch++ = BoneTranslation[Float];
            }
        }
        
        services->setVertexShaderConstant("bones[0]", Uniforms, MaxBones * 16);
    }
};
 
class irrAnimator
{
    irr::scene::ISceneManager* smgr;
    irr::scene::IAnimatedMeshSceneNode* Node;
 
    //  Shader Callback
    irrSkinningCallback* callback;
 
public:
    irr::scene::IAnimatedMeshSceneNode* GetNode() { return Node; }
 
    irrAnimator(irr::video::IVideoDriver* driver, irr::scene::ISceneManager* SceneManager) : smgr(SceneManager)
    {
        callback = new irrSkinningCallback;
        Node = smgr->addAnimatedMeshSceneNode(smgr->getMesh("dwarf.x"));
        Node->setMaterialFlag(irr::video::EMF_LIGHTING, false);
        callback->SetupNode(driver, Node);  //  Apply Skinning Shader
 
        //  Hacky way to set vertex weights
        irr::scene::ISkinnedMesh* SkinnedMesh = (irr::scene::ISkinnedMesh*)Node->getMesh();
 
        for (irr::u32 Buffer = 0; Buffer < SkinnedMesh->getMeshBuffers().size(); ++Buffer)
        {
            for (irr::u32 Vert = 0; Vert < SkinnedMesh->getMeshBuffers()[Buffer]->getVertexCount(); ++Vert)
            {
                SkinnedMesh->getMeshBuffers()[Buffer]->getVertex(Vert)->Color = irr::video::SColor(0, 0, 0, 0);
            }
        }
 
        for (irr::u32 Joint = 0; Joint < SkinnedMesh->getAllJoints().size(); ++Joint)
        {
            for (irr::u32 Weight = 0; Weight < SkinnedMesh->getAllJoints()[Joint]->Weights.size(); ++Weight)
            {
                const irr::u32 buffId = SkinnedMesh->getAllJoints()[Joint]->Weights[Weight].buffer_id;
 
                const irr::u32 vertexId = SkinnedMesh->getAllJoints()[Joint]->Weights[Weight].vertex_id;
                irr::video::SColor* vColor = &SkinnedMesh->getMeshBuffers()[buffId]->getVertex(vertexId)->Color;
 
                if (vColor->getRed() == 0)
                    vColor->setRed(Joint + 1);
                else if (vColor->getGreen() == 0)
                    vColor->setGreen(Joint + 1);
                else if (vColor->getBlue() == 0)
                    vColor->setBlue(Joint + 1);
                else if (vColor->getAlpha() == 0)
                    vColor->setAlpha(Joint + 1);
            }
        }
    }
 
    ~irrAnimator()
    {
        delete callback;
    }
};
And the shader:

Code:

uniform mat4 bones[100];
 
void main(void)
{
    int BoneID = int(gl_Color.r * 255);
    mat4 vertTran = bones[BoneID - 1];
    
    BoneID = int(gl_Color.g * 255);
    if(BoneID > 0)
        vertTran += bones[BoneID - 1];
 
    BoneID = int(gl_Color.b * 255);
    if(BoneID > 0)
        vertTran += bones[BoneID - 1];
        
    BoneID = int(gl_Color.a * 255);
    if(BoneID > 0)
        vertTran += bones[BoneID - 1];
    
    gl_Position = gl_ModelViewProjectionMatrix * vertTran * gl_Vertex;
    
    gl_FrontColor = vec4(1,1,1,1);
    gl_TexCoord[0] = gl_MultiTexCoord0;
    gl_TexCoord[1] = gl_MultiTexCoord1;
}
And it works!
https://youtu.be/rDImu6lNlX8
27FPS and 1.3million triangles.
Sadly, turning off hardware skinning and removing the shader gives about a 40% speedup. That doesn't really make sense to me; you would think the speed increase would be in the other direction.
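
For anyone following along, here is a minimal standalone sketch of the colour-channel trick used above: each vertex stores up to four 1-based joint indices in R, G, B and A, with 0 meaning "no bone". The names `PackedBones`, `addBone` and `decodeBone` are made up for illustration and are not Irrlicht API. The decode rounds to nearest, on the assumption that a plain truncating `int(gl_Color.r * 255)` could land one index low if the normalized value comes back slightly under.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Four 8-bit channels (R, G, B, A), each holding a 1-based joint index; 0 means "unused".
using PackedBones = std::array<std::uint8_t, 4>;

// Store the 0-based `joint` in the first free channel.
// Returns false when all four channels are taken or the index does not fit in 8 bits.
inline bool addBone(PackedBones& packed, std::uint32_t joint)
{
    if (joint >= 255) return false;  // joint + 1 must fit in one byte
    for (std::uint8_t& channel : packed) {
        if (channel == 0) {
            channel = static_cast<std::uint8_t>(joint + 1);
            return true;
        }
    }
    return false;
}

// CPU-side mirror of the shader decode: the GPU sees channel / 255.0,
// so round to nearest before converting back. Returns -1 for an unused slot.
inline int decodeBone(float normalizedChannel)
{
    const int id = static_cast<int>(normalizedChannel * 255.0f + 0.5f);
    return id - 1;
}
```

One consequence of this layout is a hard limit of 255 joints per mesh and four bones per vertex; a fifth influence is silently dropped, just as in the loop above.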
Dream Big Or Go Home.
Help Me Help You.
devsh
Competition winner
Posts: 2057
Joined: Tue Dec 09, 2008 6:00 pm
Location: UK
Contact:

Re: Setting Bones From Matrix

Post by devsh »

What GPU do you have?

Putting Bones in a Uniform Array may cause Constant-Waterfalling on pre OpenGL 4.0 GPUs.

Look at IrrlichtBAW's HardwareSkinning example and modify it to have 25x25 dwarves; I get 17 FPS with lighting (on an Nvidia 1060 Mobile).
The problem, however, is that each node (out of the 625) gets its own TextureBufferObject (Texture+Buffer) which gets updated every frame, so it's not fully optimized yet.

Beware of comparing a full GPU load with an empty CPU load. Skinning hundreds of thousands of vertices on the CPU will be faster than doing it all on the GPU when the CPU has nothing else to do; as soon as you throw some CPU load at it (game logic, AI or physics), the balance may change.

ANOTHER NOTE: YOUR DWARVES ARE ALL IN SYNC, SO CPU SKINS ONLY 1 MESH PER FRAME AS ALL 625 NODES SHARE THE SAME MESH.

TRY HAVING THE DWARVES ANIMATE AT DIFFERENT SPEEDS!
(Once I did, my hardware skinning became 10x faster than CPU skinning, especially when the meshes were off-screen, out of the camera's view.)
http://irrlicht.sourceforge.net/forum/v ... es#p299488

Re: Setting Bones From Matrix

Post by devsh »

Sorry, I get 135 FPS with 625 dwarves... I had the engine compiled in debug mode XD

Re: Setting Bones From Matrix

Post by kklouzal »

I suppose I was just expecting to render way more than what I got.
This would probably be a scenario where instancing would greatly improve the situation.
https://www.videocardbenchmark.net/gpu. ... 00%2F8700M
Radeon 8600M/8700M, not the best card but it gets the job done.

The next thing I'll do is cut it down to 14x14 (196 nodes) and ensure they only update at a fixed 30 FPS.
Aside from that, I'm not sure there is much I could do to increase performance. The shader itself seems pretty basic at this point.
Last edited by kklouzal on Fri Feb 02, 2018 12:51 am, edited 1 time in total.

Re: Setting Bones From Matrix

Post by devsh »

We don't have instancing on our skinned meshes, yet.

With an 8800M GPU the highest FPS you can expect is 18-19.
You're getting 27FPS in CPU skinning mode because you're drawing the same mesh in the same pose 625 times.

There is something wrong with your shader code.

Code:

 
    int BoneID = int(gl_Color.r * 255);
    mat4 vertTran = bones[BoneID - 1];
   
    BoneID = int(gl_Color.g * 255);
    if(BoneID > 0)
        vertTran += bones[BoneID - 1];
 
    BoneID = int(gl_Color.b * 255);
    if(BoneID > 0)
        vertTran += bones[BoneID - 1];
       
    BoneID = int(gl_Color.a * 255);
    if(BoneID > 0)
        vertTran += bones[BoneID - 1];
 
These bone matrices are not weighted by the vertex skinning weights. It's a miracle the skinning works at all; it surely won't work on other meshes, because it assumes equal weight for every bone affecting a vertex.
(This may also be the reason why your OZZ skinning doesn't work.)

One note, instead of accumulating your bone matrices, accumulate vertices transformed by matrices.

//adding transformed weighted vertices is better than adding weighted matrices and then transforming
//averaging matrices = [1,4]*(21 fmads) + 15 fmads
//averaging transformed verts = [1,4]*(15 fmads + 7 muls)


Final remark: use mat4x3 for the bone matrices. You'll save 25% of the uniform memory and increase your FPS by 10-25%.
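
To make the two suggestions above concrete, here is a rough CPU-side sketch of weighted blending that accumulates transformed vertices rather than matrices, with each bone stored as 4 columns of 3 floats (the layout of a GLSL mat4x3). `Vec3`, `Bone4x3`, `transformPoint` and `skinVertex` are illustrative names, not engine API, and the weights are assumed to sum to 1. As a side note, summing unweighted affine matrices also sums their w component, so the perspective divide turns the sum into an equal-weight average, which is probably why the unweighted shader appears to work.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

struct Vec3 { float x, y, z; };

// Affine bone transform as 4 columns of 3 floats:
// columns 0..2 hold the rotation/scale basis, column 3 holds the translation.
using Bone4x3 = std::array<float, 12>;

// Transform a point (implicit w = 1) by an affine 4x3 bone matrix.
inline Vec3 transformPoint(const Bone4x3& m, const Vec3& p)
{
    return {
        m[0] * p.x + m[3] * p.y + m[6] * p.z + m[9],
        m[1] * p.x + m[4] * p.y + m[7] * p.z + m[10],
        m[2] * p.x + m[5] * p.y + m[8] * p.z + m[11]
    };
}

// Linear blend skinning: result = sum over i of weights[i] * (bones[i] * position).
// Slots with a zero weight or a null bone pointer are skipped.
inline Vec3 skinVertex(const Vec3& position,
                       const std::array<const Bone4x3*, 4>& bones,
                       const std::array<float, 4>& weights)
{
    Vec3 out{0.0f, 0.0f, 0.0f};
    for (std::size_t i = 0; i < 4; ++i) {
        if (weights[i] == 0.0f || bones[i] == nullptr) continue;
        const Vec3 t = transformPoint(*bones[i], position);
        out.x += weights[i] * t.x;
        out.y += weights[i] * t.y;
        out.z += weights[i] * t.z;
    }
    return out;
}
```

In a shader the same idea needs the per-vertex weights as an extra attribute; the vertex colour is already taken by the bone indices here, so where the weights would live is left open.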

Re: Setting Bones From Matrix

Post by kklouzal »

Thank you, devsh; you have been a big help to me and the rest of the community!

Re: Setting Bones From Matrix

Post by devsh »

kklouzal wrote: The next thing I'll do is cut it down to 14x14 (196 nodes) and ensure they only update at a fixed 30fps
We already have this in IrrBAW:

Code:

 
        //! only for EBUM_NONE and EBUM_READ, it dictates what is the actual frequency we want to bother updating the mesh
        //! because we don't want to waste CPU time if we can tolerate the bones updating at 120Hz or similar
        virtual void setDesiredUpdateFrequency(const float& hertz) = 0;
By default it's 120 Hz.
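
For plain Irrlicht, the same idea can be approximated by gating the bone update on elapsed time. This is only a sketch of the principle, not how IrrBAW implements `setDesiredUpdateFrequency`; `UpdateGate` and `tick` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not part of Irrlicht or IrrlichtBAW):
// lets the caller recompute bones at most `hertz` times per second.
class UpdateGate
{
public:
    explicit UpdateGate(float hertz) : interval_(1.0f / hertz)
    {
        assert(hertz > 0.0f);
    }

    // Feed the frame delta in seconds; returns true when the bones should be recomputed.
    bool tick(float deltaSeconds)
    {
        accumulated_ += deltaSeconds;
        if (accumulated_ < interval_) return false;
        // Keep only the remainder so one long frame triggers a single update, not several.
        accumulated_ = std::fmod(accumulated_, interval_);
        return true;
    }

private:
    float interval_;
    float accumulated_ = 0.0f;
};
```

In the skinning callback above, something like `if (!gate.tick(dt)) return;` before the matrix loop would reuse the previously uploaded matrices, assuming the driver keeps uniform values between draws.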

Re: Setting Bones From Matrix

Post by devsh »

kklouzal wrote: Sadly, turning off hardware skinning and removing the shader gives about a 40% speedup. That doesn't really make sense to me; you would think the speed increase would be in the other direction.
Do this to your dwarf nodes, and you'll see this statement is wrong:

Code:

 
#define kInstanceSquareSize 25
 
        // Give every node its own playback speed so the dwarves fall out of sync.
        // Assumes dwarfNode[] already holds the created scene nodes.
        for (size_t x=0; x<kInstanceSquareSize; x++)
        for (size_t z=0; z<kInstanceSquareSize; z++)
            dwarfNode[x+kInstanceSquareSize*z]->setAnimationSpeed(18.f*float(x+1+(z+1)*kInstanceSquareSize)/float(kInstanceSquareSize*kInstanceSquareSize));
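
To see what that formula actually produces, the speed calculation can be pulled out into a small standalone function. `animationSpeedFor` is a name invented here; the arithmetic is copied from the loop above.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kInstanceSquareSize = 25;

// Animation speed for the node at grid cell (x, z), using the formula from the loop above.
// Every cell gets a different value, so no two nodes stay on the same frame for long.
inline float animationSpeedFor(std::size_t x, std::size_t z)
{
    assert(x < kInstanceSquareSize && z < kInstanceSquareSize);
    const float cells = float(kInstanceSquareSize * kInstanceSquareSize);
    return 18.f * float(x + 1 + (z + 1) * kInstanceSquareSize) / cells;
}
```

With a 25x25 grid the speeds run from roughly 0.75 at cell (0, 0) to roughly 18.7 at cell (24, 24), in steps of 18/625, so the CPU path can no longer skin one shared pose per frame.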
 

Re: Setting Bones From Matrix

Post by kklouzal »

LOL they are doing the wave!
196 nodes with gpu skinning 70fps
196 nodes with cpu skinning 35fps

Literally a 100% increase in performance with gpu skinning :)
devsh wrote: One note, instead of accumulating your bone matrices, accumulate vertices transformed by matrices.
..
Final remark, use mat4x3 for the bone matrices, you'll save 25% of the Uniform memory and increase your FPS by 10-25%.
I'll try to figure these out when I get home from work tonight. I quickly tried the 4x3 matrix tweak but was getting shader compilation errors.
I was able to do it with a 3x3 matrix, but the arms on the nodes were mangled up.
This whole little side project is my first experience with shaders and matrices. I don't really know what I'm doing yet. >.<
I'm more of a learn-by-example type.
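
A note on the 4x3 versus 3x3 attempt, written as a sketch rather than a tested fix. A mat3 keeps only rotation and scale and drops the translation column, which would be consistent with the mangled arms. A mat4x3 keeps the translation and drops only the constant (0, 0, 0, 1) part. On the CPU side that means uploading 12 floats per bone instead of 16. `packAffine4x3` below is an illustrative helper, and it assumes the 16 floats are laid out the way the callback above uploads them, with the translation in elements 12..14. Non-square types such as mat4x3 need GLSL 1.20 or later, so a missing `#version 120` could be one cause of the compile errors; that part is a guess. Whether the driver's `setVertexShaderConstant` accepts a mat4x3 uniform also needs checking.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Copy a 4x4 affine matrix (16 floats, four columns of four) into four columns of three,
// dropping elements 3, 7, 11 and 15, which are 0, 0, 0, 1 for an affine transform.
inline std::array<float, 12> packAffine4x3(const float* m16)
{
    assert(m16 != nullptr);
    std::array<float, 12> out{};
    for (std::size_t column = 0; column < 4; ++column) {
        for (std::size_t row = 0; row < 3; ++row) {
            out[column * 3 + row] = m16[column * 4 + row];
        }
    }
    return out;
}
```

The upload buffer then shrinks from MaxBones * 16 to MaxBones * 12 floats, which is where the 25% saving comes from.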

Re: Setting Bones From Matrix

Post by devsh »

Now that I've told you that we have the X Hz-only boning mode in IrrBAW, one could take it a whole step further: render to screen and save the triangles to transform feedback (possible with the same shader in the same draw call) only when the current keyframe changes, and then just draw the mobs in one draw call (a huge batch) with a transform feedback draw until the mobs change the frame they're animating on.

This would add another 100% based on my comparisons with a shader without any skinning.

Also you'd get the shadow-pass for half-price (meshes stay static between shadow and main render).


Not to mention that you could account for the max camera movement in 30 or 120Hz and frustum cull the triangles of your mobs :D .
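
The bookkeeping half of that idea, deciding when the cached skinned geometry has gone stale, might look like the sketch below. It does not show the transform feedback capture or the batched draw. `KeyframeTracker` is a hypothetical name, and treating the integer part of the current frame as "the keyframe" is an assumption.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical bookkeeping (not an Irrlicht or IrrlichtBAW API):
// reports when a node has moved on to a different keyframe,
// i.e. when its cached skinned triangles would need to be captured again.
class KeyframeTracker
{
public:
    // `currentFrame` is the node's fractional animation frame.
    bool needsReskin(float currentFrame)
    {
        const long keyframe = static_cast<long>(std::floor(currentFrame));
        if (hasKeyframe_ && keyframe == lastKeyframe_) return false;
        lastKeyframe_ = keyframe;
        hasKeyframe_ = true;
        return true;
    }

private:
    long lastKeyframe_ = 0;
    bool hasKeyframe_ = false;
};
```

Frames where `needsReskin` returns false would draw from the captured buffer; only the frames where it returns true pay for skinning.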

Re: Setting Bones From Matrix

Post by devsh »

So anyway, this is my half-assed, semi-optimized attempt at 625 skinned dwarves animating at different speeds with an omnidirectional point light with shadows.

Cubemap shadows are drawn in one render pass using layered rendering with gl_Layer and geometry shader hardware-instancing (fixed function tessellation SM 5.0).

I could optimize with what I said above, as well as by using the OpenGL extension for explicitly specifying MSAA sample locations to get 8x or 16x fewer pixel shader invocations during the depth-only pass, but I'd have to switch from a cubemap FBO attachment to a multisample array.
And then obviously a custom resolve, but that could be combined with mipmap chain generation and the blur pass for VSM or other.

So anyway, 56FPS on Nvidia 1060 laptop GPU
Image
(You'd get about 7FPS)

But the good news is that with 100 dwarves I get 1000 FPS.