Shader Instancing

slavik262
Posts: 753
Joined: Sun Nov 22, 2009 9:25 pm
Location: Wisconsin, USA

Shader Instancing

Post by slavik262 »

Spot the difference between these two 25 x 25 x 25 grids of cubes: :wink:

Image

Image

Shader instancing uses a vertex shader to draw up to 60 instances of an object at once, each with its own transform. In simpler terms, it lets you draw up to 60 copies of an object, each with its own position, scale, and rotation that can change continuously. This is similar to Lonesome Ducky's mesh combiner, with the following differences:
  • Shader instancing renders several copies of a single mesh at once. It cannot render multiple different meshes.
  • However, individual instances in a group of objects using shader instancing can move, rotate, and scale independently of each other, making it ideal for large groups of moving objects.
Where Ducky's mesh combiner is good for static things like scenery, this excels with groups of moving objects. Practical examples include a barrage of missiles, bullets, or bullet casings.
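
To put a rough number on the batching, here is a minimal sketch of the draw-call arithmetic for a 25 x 25 x 25 grid like the one in the screenshots. It assumes one draw call per cube without instancing and one draw call per batch of up to 60 copies with it. `drawCallsNeeded` is a hypothetical helper for illustration, not part of Irrlicht.

```cpp
#include <cstdint>

// Hypothetical helper: number of draw calls needed to render `instances`
// copies when each call can carry at most `batchSize` of them.
constexpr std::uint32_t drawCallsNeeded(std::uint32_t instances, std::uint32_t batchSize)
{
    // Round up so a partially filled last batch still gets its own call.
    return (instances + batchSize - 1) / batchSize;
}

constexpr std::uint32_t kGridCubes = 25 * 25 * 25; // 15625 cubes in the grid
constexpr std::uint32_t kBatchSize = 60;           // SM 2.0 figure quoted in this thread

// Assumption: without instancing every cube costs one draw call (15625 calls).
constexpr std::uint32_t kInstancedCalls = drawCallsNeeded(kGridCubes, kBatchSize); // 261 calls
```

Under those assumptions the last batch carries only 25 cubes (15625 - 260 * 60), so 35 of its 60 slots go unused.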

My code uses empty scene nodes to store the transforms, so adding an instance returns a pointer to a scene node. This lets you treat each instance like any other scene node, so you can apply animators, set its position, etc. It works well even on low-end hardware, as long as it supports Shader Model 2.0 (the screenshots above were taken on a laptop with integrated graphics).

I'm finishing up the code and trying to figure out a few remaining issues. One of them is related to the material type: since this uses a shader to do its magic, any custom material types also have to use the instancing code in their shader.
BlindSide
Admin
Posts: 2821
Joined: Thu Dec 08, 2005 9:09 am
Location: NZ!

Post by BlindSide »

slavik262 wrote: I'm finishing up the code and trying to figure out a few remaining issues. One of them is related to the material type: since this uses a shader to do its magic, any custom material types also have to use the instancing code in their shader.

How about the ordinary material types? I found that if you just use the vertex shader and don't pass a pixel shader, most of the necessary things are handled by the base material you set. Just make sure to perform lighting, pass color, etc. in the vertex shader.

EDIT: Oh yeah and great work! I'm amazed at the quality of stuff I'm seeing on here lately.
ShadowMapping for Irrlicht!: Get it here
Need help? Come on the IRC!: #irrlicht on irc://irc.freenode.net
Lonesome Ducky
Competition winner
Posts: 1123
Joined: Sun Jun 10, 2007 11:14 pm

Post by Lonesome Ducky »

Ah, this is what I had in mind when I wrote my code, but I decided it would be better just for static environments. This looks like exactly what I need though! Can you draw more instances with Pixel Shader 3.0? And are there any plans for supporting animated meshes?
slavik262
Posts: 753
Joined: Sun Nov 22, 2009 9:25 pm
Location: Wisconsin, USA

Post by slavik262 »

Shader Model 3 actually allows you to pass a 4x4 transformation matrix as vertex data (just like an extra texture coordinate), so the number of instances you can render at once is limited only by the size of your mesh buffer. This is usually called "hardware instancing." The problem is that I don't know how to do this, and even if I did, I assume that it would require a decent amount of modification to Irrlicht.

I haven't done any work with animations, but it should be fairly straightforward and I can see it being integrated in.
3DModelerMan
Posts: 1691
Joined: Sun May 18, 2008 9:42 pm

Post by 3DModelerMan »

These are vertex shaders? I thought you couldn't do that without geometry shaders.
That would be illogical captain...

My first full game:
http://www.kongregate.com/games/3DModel ... tor#tipjar
slavik262
Posts: 753
Joined: Sun Nov 22, 2009 9:25 pm
Location: Wisconsin, USA

Post by slavik262 »

Here are the basics (all of it can be done with SM 2.0):
  1. Convert your mesh buffers to mesh buffers with 2 texture coordinates.

    Code: Select all

    duplicatedMesh = SceneManager->getMeshManipulator()->createMeshWith2TCoords(mesh);
  2. Duplicate your vertices until you have 60 copies of the original mesh buffer in one mesh buffer or the maximum you can fit in the mesh buffer, whichever comes first.
  3. Number each copy of the original mesh buffer using the texture coordinates (they serve as an index for the copy).
  4. In your shader, create an array of 60 float4x4 transform matrices (60 is the SM 2.0 max according to MSDN).

    Code: Select all

    float4x4 worldTransforms[60];
    
  5. In the shader callback, copy the transforms you need into a float array and use setVertexShaderConstant to send it to the shader array. Since you most likely won't have a number of instances divisible by 60, you can cheat and make unused instances "invisible" by setting their scale to 0 (set indices 0, 5, and 10 of the transform to 0).

    Code: Select all

    it = instances.begin();
    timesDuplicated = timesBufferDuplicated[m];
    //Draw all instances
    while(it != instances.end())
    {
    	//Copy [timesDuplicated] (<= 60) instances into the transform
    	//matrix array and do a render
    	for(u32 i = 0; i < timesDuplicated; ++i)
    	{
    		//Even if we've set all the instances, we still need to
    		//make the rest of the instances invisible
    		if(it == instances.end())
    		{
    			//Set scale to 0 (ad hoc invisibility)
    			worldTransforms[i][0] =
    				worldTransforms[i][5] =
    				worldTransforms[i][10] = 0.0f;
    			continue;
    		}
    		instance = *it;
    		if(instance->isVisible())
    		{
    			memcpy(worldTransforms[i], instance->getAbsoluteTransformation().pointer(),
    				16 * sizeof(f32)); //16 floats in a 4x4 transform matrix
    		}
    		++it;
    	}
    	vd->drawMeshBuffer(duplicatedMesh->getMeshBuffer(m));
    }
    
  6. Draw the mesh buffer with the duplicated vertices. In the vertex shader, multiply the incoming vertex position by the transform determined by the texture coordinate index of the vertex. Then multiply by the view * projection matrix.

    Code: Select all

    struct VertexInput
    {
    	float3 position: POSITION;
    	float4 normal : NORMAL;
    	float2 uv : TEXCOORD0;
    	float2 instanceIndex : TEXCOORD1;
    	float4 color : COLOR0;
    };
    
    struct VertexOutput
    {
    	float4 screenPos : POSITION;
    	float4 color : COLOR0;
    	float2 uv : TEXCOORD0;
    };
    
    VertexOutput InstancingVS(VertexInput IN)
    {
    	VertexOutput OUT = (VertexOutput)0;
    	int index = IN.instanceIndex.x;
    	float4x4 worldViewProjection = mul(worldTransforms[index], viewProjection);
    	OUT.screenPos = mul(float4(IN.position, 1), worldViewProjection);
    	OUT.color = IN.color;
    	OUT.uv = IN.uv;
    
    	return OUT;
    }
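
Steps 2 and 3 of the list above (duplicating the vertices and numbering each copy) can be sketched without any engine types. This is only an illustration under stated assumptions: `Vertex` and `MeshData` are hypothetical stand-ins for Irrlicht's S3DVertex2TCoords and a mesh buffer, the instance number is assumed to live in the X component of the second texture coordinate, and 16-bit indices are assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a vertex with two texture coordinate sets.
struct Vertex
{
    float x, y, z;       // position
    float u, v;          // first texture coordinates (TEXCOORD0)
    float instanceIndex; // X of the second set (TEXCOORD1.x), used as the copy number
};

// Hypothetical stand-in for a mesh buffer.
struct MeshData
{
    std::vector<Vertex> vertices;
    std::vector<std::uint16_t> indices;
};

// Builds one buffer holding `copies` copies of `source`. Every vertex of
// copy k is tagged with instanceIndex == k, and the indices of copy k are
// shifted so that they point at that copy's own vertices.
MeshData duplicateForInstancing(const MeshData& source, std::uint32_t copies)
{
    // Assumption: 16-bit indices, which can address at most 65536 vertices.
    assert(source.vertices.size() * copies <= 65536u);

    MeshData out;
    out.vertices.reserve(source.vertices.size() * copies);
    out.indices.reserve(source.indices.size() * copies);

    for (std::uint32_t k = 0; k < copies; ++k)
    {
        const std::size_t base = out.vertices.size(); // first vertex of copy k
        for (Vertex v : source.vertices)
        {
            v.instanceIndex = static_cast<float>(k);
            out.vertices.push_back(v);
        }
        for (std::uint16_t idx : source.indices)
            out.indices.push_back(static_cast<std::uint16_t>(base + idx));
    }
    return out;
}
```

Shifting the indices per copy is the part that is easy to forget: if every copy reused the original index values, all 60 copies would draw the vertices of copy 0.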
    
Granyte
Posts: 850
Joined: Tue Jan 25, 2011 11:07 pm

Re: Shader Instancing

Post by Granyte »

Has anyone tried to use the instancing recently?

I tried to build a class to encapsulate it and it just fails.

Code: Select all

class InstanceShaderCallBack : public video::IShaderConstantSetCallBack
{
public:
    core::matrix4 instanceWorldArray[60];

    InstanceShaderCallBack(){};

    virtual void OnSetConstants(video::IMaterialRendererServices* services,
            s32 userData)
    {
        video::IVideoDriver* driver = services->getVideoDriver();
        core::matrix4 viewProjection;
        viewProjection = driver->getTransform(video::ETS_PROJECTION);
        viewProjection *= driver->getTransform(video::ETS_VIEW);
        services->setVertexShaderConstant("instanceWorldArray", (f32*)instanceWorldArray, 16*60);
        services->setVertexShaderConstant("viewProjection", viewProjection.pointer(), 16);
    }
};
 
 
 
class CInstancingManager : public irr::scene::ISceneNode
{
public:
    CInstancingManager(scene::ISceneNode* parent, scene::ISceneManager* mgr, s32 id)
        : scene::ISceneNode(parent, mgr, id)
    {
        InstanceCount=0;
        io::path vsFileName; // filename for the vertex shader
        io::path psFileName; // filename for the pixel shader
        psFileName = "instancing.hlsl";
        vsFileName = psFileName; // both shaders are in the same file
        video::IGPUProgrammingServices* gpu = mgr->getVideoDriver()->getGPUProgrammingServices();
        CurrentInstanceShader = new InstanceShaderCallBack();
        s32 InstancingMat = gpu->addHighLevelShaderMaterialFromFiles(
            vsFileName, "vs_main", video::EVST_VS_3_0,
            psFileName, "ps_main", video::EPST_PS_3_0,
            CurrentInstanceShader, video::EMT_SOLID);
        Material.MaterialType = (E_MATERIAL_TYPE)InstancingMat;
        AABB.MaxEdge.X = 0;
        AABB.MaxEdge.Y = 0;
        AABB.MaxEdge.Z = 0;
        AABB.MinEdge.X = 0;
        AABB.MinEdge.Y = 0;
        AABB.MinEdge.Z = 0;
        this->setAutomaticCulling(EAC_OFF);
    }

    InstanceShaderCallBack* CurrentInstanceShader;
    irr::core::list<ISceneNode*> instances;

    InstanceShaderCallBack* InstancingShader;

    video::SMaterial Material;
    SMeshBufferLightMap dupBuffer;
    int InstanceCount;

    ISceneNode* AddMesh(IMesh* aMesh)
    {
        aMesh->grab();
        IMesh* bMesh = SceneManager->getMeshManipulator()->createMeshWith2TCoords(aMesh);
        IMeshBuffer* bBuffer = bMesh->getMeshBuffer(0);

        //create dupBuffer with bBuffer repeated NUM_BATCH_INSTANCES times

        S3DVertex2TCoords* verts = (S3DVertex2TCoords*)bBuffer->getVertices();
        for (u32 i=0; i<bBuffer->getVertexCount(); ++i)
        {
            verts[i].TCoords2.X = InstanceCount;//assign the index of instance that each vertex belongs to
        }
        dupBuffer.append(bBuffer->getIndices(),bBuffer->getVertexCount(),bBuffer->getIndices(),bBuffer->getIndexCount());

        InstanceCount++;

        aMesh->drop();
        bMesh->drop();

        //save transformation in one EmptySceneNode which doesn't render itself

        ISceneNode* empty = SceneManager->addEmptySceneNode();
        empty->setPosition(vector3df(1));
        empty->setScale(vector3df(1));
        empty->setRotation(vector3df(1));
        empty->setVisible(true);
        instances.push_back(empty);
        empty->grab();
        return empty;
    }

    core::aabbox3d<f32> AABB;

    void OnAnimate()
    {
        irr::core::list<irr::scene::ISceneNode*>::Iterator it = instances.begin();

        for(u32 i = 0; i < 60; ++i)
        {
            //Even if we've set all the instances, we still need to
            //make the rest of the instances invisible
            if(it == instances.end())
            {
                //Set scale to 0 (ad hoc invisibility)
                CurrentInstanceShader->instanceWorldArray[i][0] =
                    CurrentInstanceShader->instanceWorldArray[i][5] =
                    CurrentInstanceShader->instanceWorldArray[i][10] = 0.0f;
                continue;
            }
            ISceneNode* instance = *it;
            if(instance->isVisible())
            {
                memcpy(&CurrentInstanceShader->instanceWorldArray[i], instance->getAbsoluteTransformation().pointer(),
                    16 * sizeof(f32)); //16 floats in a 4x4 transform matrix
                AABB.MaxEdge.X = std::max(instance->getAbsolutePosition().X,AABB.MaxEdge.X);
                AABB.MaxEdge.Y = std::max(instance->getAbsolutePosition().Y,AABB.MaxEdge.Y);
                AABB.MaxEdge.Z = std::max(instance->getAbsolutePosition().Z,AABB.MaxEdge.Z);
                AABB.MinEdge.X = std::min(instance->getAbsolutePosition().X,AABB.MinEdge.X);
                AABB.MinEdge.Y = std::min(instance->getAbsolutePosition().Y,AABB.MinEdge.Y);
                AABB.MinEdge.Z = std::min(instance->getAbsolutePosition().Z,AABB.MinEdge.Z);
            }
            ++it;
        }
    }

    virtual const core::aabbox3d<f32>& getBoundingBox() const
    {
        return AABB;
    }

    virtual void render()
    {
        IVideoDriver* driver = SceneManager->getVideoDriver();
        driver->setTransform(video::ETS_WORLD, AbsoluteTransformation);
        SceneManager->getVideoDriver()->drawMeshBuffer(&dupBuffer);
    }
};
Does anyone have an idea?
fmx

Re: Shader Instancing

Post by fmx »

I'm guessing the problem is likely in your choice of 60 matrix4's.
What happens when you reduce the cap to a lower value, for example 10 or 20?

AFAIK the real limit on the maximum number of matrices that can be uploaded to a shader depends on the hardware (i.e., the graphics card) and not only on the shader model version.
Granyte
Posts: 850
Joined: Tue Jan 25, 2011 11:07 pm

Re: Shader Instancing

Post by Granyte »

My hardware is relatively recent and quite powerful, so I don't think it's a hardware limit.

The only thing I get is artifacts, even after reducing the maximum to 20.
Mel
Competition winner
Posts: 2292
Joined: Wed May 07, 2008 11:40 am
Location: Granada, Spain

Re: Shader Instancing

Post by Mel »

My hardware is also quite new and won't allow me to use more than 59 matrices in a shader. It is a shader constant limit: you can't set more than 4096 bytes of data in the constant registers, or at least you can't with the way Irrlicht is configured. I've seen NVIDIA demos that are able to use much more space. The Dawn demo, whose skinning system is explained in a GPU Gems article, used up to 96 matrices in a single shader, so maybe it is possible to raise that value somehow.
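
The 4096-byte figure can be turned into a matrix count with a little arithmetic. This is a rough sketch, assuming each constant register is one float4 (16 bytes) and each float4x4 occupies four of them; `matricesThatFit` is a hypothetical helper, and what else a given shader or driver stores in the constant registers is not modeled here.

```cpp
#include <cstdint>

constexpr std::uint32_t kBytesPerRegister   = 16; // one float4 = 4 floats * 4 bytes
constexpr std::uint32_t kRegistersPerMatrix = 4;  // one float4x4 = 4 float4 rows

// Hypothetical helper: how many float4x4 matrices fit into a constant budget
// of `budgetBytes` once `otherRegisters` float4 registers are taken by
// everything else the shader needs.
constexpr std::uint32_t matricesThatFit(std::uint32_t budgetBytes, std::uint32_t otherRegisters)
{
    return (budgetBytes / kBytesPerRegister <= otherRegisters)
        ? 0u
        : (budgetBytes / kBytesPerRegister - otherRegisters) / kRegistersPerMatrix;
}

constexpr std::uint32_t kNothingElse  = matricesThatFit(4096, 0); // 64 matrices
constexpr std::uint32_t kWithViewProj = matricesThatFit(4096, 4); // 63, one float4x4 reserved
```

By this arithmetic a cap of 59 would correspond to 17 to 20 registers being used by other constants, but which constants those would be is a guess.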

I can't tell whether something similar happens in OpenGL, though. It should work pretty much the same, or, if it doesn't, you could look at OpenGL instancing examples that are able to surpass the limitations of DX.
"There is nothing truly useless, it always serves as a bad example". Arthur A. Schmitt
Granyte
Posts: 850
Joined: Tue Jan 25, 2011 11:07 pm

Re: Shader Instancing

Post by Granyte »

Well, currently I'm not hitting a limitation. I'm only getting artifacts; no cube is rendered to the screen at all.
Granyte
Posts: 850
Joined: Tue Jan 25, 2011 11:07 pm

Re: Shader Instancing

Post by Granyte »

Also, here is my shader code:

Code: Select all

 
float4x4 viewProjection;
#define NUM_BATCH_INSTANCES 20
float4x4 instanceWorldArray[NUM_BATCH_INSTANCES];
 
struct VertexInput 
{ 
   float3 position: POSITION; 
   float3 normal : NORMAL; 
   float2 uv : TEXCOORD0; 
   float2 uv2 : TEXCOORD1; 
   float4 color : COLOR0; 
}; 
 
struct VertexOutput 
{ 
   float4 screenPos : POSITION; 
   float4 color : COLOR0; 
   float2 uv : TEXCOORD0; 
}; 
 
VertexOutput vs_main(VertexInput IN) 
{ 
   VertexOutput OUT = (VertexOutput)0;
   int index = IN.uv2.x;
   float3 WVP = mul(instanceWorldArray[index],IN.position );
   WVP = mul(viewProjection,WVP);
   OUT.screenPos = mul(WVP,float4(WVP,1)); 
   OUT.color = IN.color;
   OUT.uv = IN.uv;
 
   return OUT; 
} 
 
float4 ps_main(
   float4 screenPos : POSITION,
   float4 color : COLOR0,
   float2 uv : TEXCOORD0
   ): COLOR0
{
   return color;
}
 


I really have no idea why this is not working.
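
As a reference point when debugging a vertex shader like this, the transform chain from step 6 of slavik262's list (position times the instance's world transform, then times view * projection) can be reproduced on the CPU and compared against what appears on screen. This is a minimal sketch, assuming a row-vector convention with the translation stored in elements 12 to 14 of the 16 floats and the scale in elements 0, 5, and 10. `Mat4`, `Vec4`, and the helper functions are hypothetical names, and how the 16 floats are packed into the shader's float4x4 is not modeled.

```cpp
#include <array>
#include <cstddef>

using Mat4 = std::array<float, 16>; // 16 floats, translation in elements 12..14
using Vec4 = std::array<float, 4>;

// Row vector times matrix: the same math as mul(vector, matrix) when the
// vector is treated as a row vector.
Vec4 mulRowVector(const Vec4& v, const Mat4& m)
{
    Vec4 r = {{0.0f, 0.0f, 0.0f, 0.0f}};
    for (std::size_t col = 0; col < 4; ++col)
        for (std::size_t row = 0; row < 4; ++row)
            r[col] += v[row] * m[row * 4 + col];
    return r;
}

// Uniform scale plus translation, no rotation (enough for this illustration).
Mat4 makeTransform(float tx, float ty, float tz, float scale)
{
    Mat4 m = {{ scale, 0.0f,  0.0f,  0.0f,
                0.0f,  scale, 0.0f,  0.0f,
                0.0f,  0.0f,  scale, 0.0f,
                tx,    ty,    tz,    1.0f }};
    return m;
}

// The "ad hoc invisibility" trick from the thread: zero elements 0, 5, 10.
void hide(Mat4& m)
{
    m[0] = m[5] = m[10] = 0.0f;
}

// World transform first, then view * projection. The position must be a
// four-component vector with w = 1 so that the translation row applies.
Vec4 toClipSpace(const Vec4& position, const Mat4& world, const Mat4& viewProjection)
{
    return mulRowVector(mulRowVector(position, world), viewProjection);
}
```

With an unrotated transform, hiding an instance collapses every vertex onto the translation point, so its triangles have zero area. For a transform that includes rotation, elements 0, 5, and 10 are not the only nonzero entries of the upper 3x3, so zeroing just those three would not fully collapse the mesh.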
REDDemon
Developer
Posts: 1044
Joined: Tue Aug 31, 2010 8:06 pm
Location: Genova (Italy)

Re: Shader Instancing

Post by REDDemon »

What about changing the thread name to "Solid particles"? ;) Very nice.
Junior Irrlicht Developer.
Real value in social networks is not about "increasing" number of followers, but about getting in touch with Amazing people.
- by Me
Mel
Competition winner
Posts: 2292
Joined: Wed May 07, 2008 11:40 am
Location: Granada, Spain

Re: Shader Instancing

Post by Mel »

Hmm... I was also wondering: are you updating the bounding box properly?
"There is nothing truly useless, it always serves as a bad example". Arthur A. Schmitt
Granyte
Posts: 850
Joined: Tue Jan 25, 2011 11:07 pm

Re: Shader Instancing

Post by Granyte »

An issue with the bounding box would result in nothing being drawn to the screen, I think. In the current case I get an interesting new glitch every time.

Is it possible that this is a bug in the mesh buffer append method? Both examples used mb->append(clingtmb* mb2), and I changed it to pass the vertices and the indices separately, since that method is commented out in the engine, but maybe the issue lies there.