Irrlicht Lime is a .NET wrapper for Irrlicht Engine

pixartist
Posts: 25
Joined: Thu Apr 14, 2011 4:58 pm

Post by pixartist »

Okay, thank you :)
I managed to get colors now.

Do you know how many vertices can be rendered in a scene with an acceptable framerate ? I tried ~720000 (in Quads mode) and it lagged like hell, but I need to draw a lot for what I've planned
serengeor
Posts: 1712
Joined: Tue Jan 13, 2009 7:34 pm
Location: Lithuania

Post by serengeor »

pixartist wrote:Okay, thank you :)
I managed to get colors now.

Do you know how many vertices can be rendered in a scene with an acceptable framerate ? I tried ~720000 (in Quads mode) and it lagged like hell, but I need to draw a lot for what I've planned
Depends on your hardware and how you do it.
People on these forums get 2 million vertices and more with good framerates.
Working on game: Marrbles (Currently stopped).
pixartist
Posts: 25
Joined: Thu Apr 14, 2011 4:58 pm

Post by pixartist »

Hmm, weird: when I switch from OpenGL to Direct3D, my vertices aren't drawn anymore :shock:
Edit: uh, Direct3D can't draw quads? Damn..
hybrid
Admin
Posts: 14143
Joined: Wed Apr 19, 2006 9:20 pm
Location: Oldenburg(Oldb), Germany
Contact:

Post by hybrid »

No, D3D cannot draw quads at all. OpenGL is slower at drawing quads and has also removed that primitive in its most recent versions (when compatibility mode is off). So better use triangles.
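For anyone converting existing quad data, the switch is mostly an index-buffer change. Here is a minimal sketch in Python (plain arithmetic, not Lime or Irrlicht API; `quads_to_triangle_indices` is a made-up helper name), assuming each quad's four vertices are stored consecutively in perimeter order:

```python
def quads_to_triangle_indices(quad_count):
    """Return a triangle-list index buffer for `quad_count` quads.

    Assumes quad q owns vertices 4q..4q+3, listed in perimeter order,
    and splits it along the 0-2 diagonal, keeping the quad's winding.
    """
    indices = []
    for q in range(quad_count):
        base = 4 * q
        indices += [base, base + 1, base + 2,   # first triangle
                    base, base + 2, base + 3]   # second triangle
    return indices

print(quads_to_triangle_indices(2))
# [0, 1, 2, 0, 2, 3, 4, 5, 6, 4, 6, 7]
```

The vertex array itself stays unchanged: 720,000 quad vertices (180,000 quads) would need 1,080,000 indices under this layout.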
greenya
Posts: 1012
Joined: Sun Jan 21, 2007 1:46 pm
Location: Ukraine
Contact:

Post by greenya »

Ok.
I have studied the problem of rendering a huge number of cubes. The results are as follows:

1) Scene nodes are not an option for Lime (nor for native Irrlicht);

2) DrawVertexPrimitiveList() is the fastest option for native Irrlicht, but not for Lime, because the managed implementation has to convert the managed indices and vertices to unmanaged ones before calling the native drawVertexPrimitiveList(), which slows this method down tremendously for large numbers of indices and vertices. I got about 22 fps for 8k cubes (96k primitives), which is unacceptable. I have made a small optimization to DrawVertexPrimitiveList(): it now takes arrays (it used to take Lists), and it copies indices faster (Marshal::Copy() is used, but this applies only to indices, not to complex types like vertices, which can only be copied one by one). These changes are already committed to svn trunk. After all that I got 26 fps, which is still too poor a result. That is why DrawVertexPrimitiveList() is not a good choice for rendering a static mesh fast in Lime;

3) Meshbuffers. The only problem with DrawVertexPrimitiveList() was the unavoidable conversion of managed data to unmanaged. Meshbuffers already hold all their data as unmanaged, so no arrays need to be converted before drawing. The current implementation of Lime has some support for meshbuffer manipulation, but it still lacks a lot (for example, it is impossible to create an empty meshbuffer yourself).

Anyway, I managed to make them work even with the current Lime support: I used GeometryCreator to create a mesh, which has a meshbuffer that I can use (modify).

I ran tests using meshbuffers and got the following results:
8k cubes - 430 fps
125k cubes - 115 fps
216k cubes - 69 fps
512k cubes - 36 fps (this is about 6 million primitives)

Screenshot (60^3 = 216k cubes, ~2.6 million primitives, 69 fps)
(my video card is GeForce 9600M GT)

Test code is here: http://irrlichtirc.g0dsoft.com/pastebin/view/32368373
greenya
Posts: 1012
Joined: Sun Jan 21, 2007 1:46 pm
Location: Ukraine
Contact:

Post by greenya »

In the post above I used the maximum possible meshbuffer fill, 65520 indices per meshbuffer: this is the largest value below 0xffff that is divisible by 36 without remainder (36 is the number of indices in a single cube mesh). For testing I rendered 125,000 cubes (1.5 million primitives).
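The 65520 figure can be checked with a couple of lines of integer arithmetic. This is only a sketch of the calculation described above; the constant names are made up for illustration:

```python
INDICES_PER_CUBE = 36      # 12 triangles * 3 indices, as stated in the post
MAX_16BIT = 0xFFFF         # upper bound used in the post

# Largest whole number of cubes whose indices fit under the bound.
cubes_per_chunk = MAX_16BIT // INDICES_PER_CUBE
indices_per_chunk = cubes_per_chunk * INDICES_PER_CUBE
print(cubes_per_chunk, indices_per_chunk)   # 1820 65520

total_cubes = 125_000
print(total_cubes * 12)                     # 1500000 triangles
```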

Results ("meshbuffer" and "chunk" are synonyms):
indicesPerChunk / numOfCubesPerChunk / totalChunks / resultingFPSwithOpenGL
65520 / 1820 cubes / 69 chunks / 118 fps
36000 / 1000 cubes / 125 chunks / 116 fps
18000 / 500 cubes / 250 chunks / 112 fps
7200 / 200 cubes / 625 chunks / 100 fps
3600 / 100 cubes / 1250 chunks / 97 fps
1800 / 50 cubes / 2500 chunks / 86 fps
1620 / 45 cubes / 2778 chunks / 78 fps
1512 / 42 cubes / 2977 chunks / 74 fps // look
1476 / 41 cubes / 3049 chunks / 6 fps // here
1440 / 40 cubes / 3125 chunks / 6 fps
1080 / 30 cubes / 4167 chunks / 6 fps

My conclusions:
- rendering more than about 3k meshbuffers is critical (fps drops from 74 to 6);
- whether a single meshbuffer is filled to 10% or to 90% makes no noticeable difference.
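The chunk counts in the table follow from a ceiling division of the 125,000 cubes by the cubes per chunk. Here is a small sketch that reproduces the first three columns (`chunks_needed` is a hypothetical helper, not part of the test code linked earlier):

```python
TOTAL_CUBES = 125_000
INDICES_PER_CUBE = 36

def chunks_needed(cubes_per_chunk, total_cubes=TOTAL_CUBES):
    """Meshbuffers needed when each holds at most `cubes_per_chunk` cubes."""
    # Integer ceiling division: the last chunk may be only partly filled.
    return (total_cubes + cubes_per_chunk - 1) // cubes_per_chunk

for cubes_per_chunk in (1820, 1000, 500, 200, 100, 50, 45, 42, 41, 40, 30):
    print(cubes_per_chunk * INDICES_PER_CUBE, "/",
          cubes_per_chunk, "cubes /",
          chunks_needed(cubes_per_chunk), "chunks")
```

The fps collapse falls between 2977 and 3049 chunks. That threshold was measured on a single setup, so it may well sit elsewhere on other hardware or drivers.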
pixartist
Posts: 25
Joined: Thu Apr 14, 2011 4:58 pm

Post by pixartist »

very nice, when will a binary version be available ?
shadowslair
Posts: 758
Joined: Mon Mar 31, 2008 3:32 pm
Location: Bulgaria

Post by shadowslair »

In case you have missed it- he posted the code after the screenshot:
"Although we walk on the ground and step in the mud... our dreams and endeavors reach the immense skies..."
pixartist
Posts: 25
Joined: Thu Apr 14, 2011 4:58 pm

Post by pixartist »

shadowslair wrote:In case you have missed it- he posted the code after the screenshot:
Well, right now device.VideoDriver.DrawVertexPrimitiveList(vertices, indices); only takes lists, not arrays.

edit: oh I see the improvement for DrawVertexPrimitiveList is only marginal... hmm, vertex buffers look annoying :(
greenya
Posts: 1012
Joined: Sun Jan 21, 2007 1:46 pm
Location: Ukraine
Contact:

Post by greenya »

pixartist wrote:right now device.VideoDriver.DrawVertexPrimitiveList(vertices, indices); does only take lists, not arrays
Well, yes, as I said, I changed the lists to arrays to be able to use the Marshal::Copy() method (which works only with arrays). This gives a small gain, but it is still too slow. I don't know how to make it faster right now, but at least I definitely know the reason: the copying. Optimizing the copying will never give good performance. Maybe in the future there will be a way to hold unmanaged arrays of indices and vertices, so that drawVertexPrimitiveList() can be called immediately, BUT if we look closely, we see that meshbuffers can already help us a lot.
pixartist wrote:oh I see the improvement for DrawVertexPrimitiveList is only marginal... hmm, vertex buffers look annoying :(
The next version of Lime will have better meshbuffer support (at least it will be possible to create 16/32-bit meshbuffers and append/update indices and vertices on them). So it will be possible to simply create a 32-bit meshbuffer, append a couple of million indices and vertices, and just call DrawMeshBuffer(). In fact this functionality is already in the trunk. However, my tests show that drawing a single 32-bit meshbuffer is 5-10% slower in OpenGL, and 5 times slower in Direct3D9, than drawing 200-300 16-bit meshbuffers.
pixartist
Posts: 25
Joined: Thu Apr 14, 2011 4:58 pm

Post by pixartist »

Well, I don't mind splitting up meshbuffers, as long as I don't need 50 lines of hilariously confusing code to get the indices right :D

looking forward to the update
greenya
Posts: 1012
Joined: Sun Jan 21, 2007 1:46 pm
Location: Ukraine
Contact:

Post by greenya »

pixartist wrote:Well, I don't mind splitting up meshbuffers, as long as I don't need 50 lines of hilariously confusing code to get the indices right :D
I believe that if you want to use meshbuffers directly and modify their content from time to time, you will need to "dance" around those indices anyway. 8)
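Most of that "dance" is offsetting: a shape's indices are local to the shape, so they have to be shifted by the number of vertices already in the buffer. Here is a rough sketch with plain Python lists standing in for a meshbuffer (`Chunk` and `append_shape` are invented names, not Lime types). It caps the vertex count per chunk, since that is what a 16-bit index value can address, whereas the benchmark above caps the index count, so treat the limit as an assumption:

```python
MAX_16BIT_VERTICES = 65536   # a 16-bit index can address vertices 0..65535

class Chunk:
    """Plain-list stand-in for one 16-bit meshbuffer (not a Lime type)."""
    def __init__(self):
        self.vertices = []
        self.indices = []

def append_shape(chunks, shape_vertices, shape_indices):
    """Append a shape whose indices are local (0-based) to the last chunk,
    opening a new chunk if its vertices would no longer be addressable."""
    if (not chunks or
            len(chunks[-1].vertices) + len(shape_vertices) > MAX_16BIT_VERTICES):
        chunks.append(Chunk())
    chunk = chunks[-1]
    base = len(chunk.vertices)            # offset for this shape's indices
    chunk.vertices.extend(shape_vertices)
    chunk.indices.extend(base + i for i in shape_indices)
    return chunk

# Two quads: the second one's indices end up shifted by 4.
quad_vertices = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)]
quad_indices = [0, 1, 2, 1, 3, 2]
chunks = []
append_shape(chunks, quad_vertices, quad_indices)
append_shape(chunks, quad_vertices, quad_indices)
print(chunks[0].indices)
# [0, 1, 2, 1, 3, 2, 4, 5, 6, 5, 7, 6]
```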

I have added a new example to Lime (for the next release), but it would be good if you tried it. Please then tell me the results, your OS and your video card. Somehow, when I use a 32-bit meshbuffer, my app crashes when I add about 700k+ cubes, but it works just fine with only 512k. With 16-bit chunking I was able to draw 800k+ without crashes. Please also tell me your physical memory size and report here the maximums (for 16-bit and 32-bit meshbuffers) you were able to run without crashes.

The binary is: http://filebeam.com/8bfdab6c3195ac7bb3cfb6d98a9a7063 [1.6 MB]
The source code of this binary is: http://irrlichtlime.svn.sourceforge.net ... athrev=275

P.S.: the cube generation algorithm really likes to 'eat' free memory :) For example, generating 1 million cubes uses about 1.9 GB. After generation, when drawing, the memory is freed by .NET's garbage collector and the app consumes only about 500 MB.
pixartist
Posts: 25
Joined: Thu Apr 14, 2011 4:58 pm

Post by pixartist »

ATI HD4670, Core 2 Duo 2 GHz, 3.5 GB RAM

216k cubes, OpenGL
16bit : 100-160 fps
32 bit: 66-102 fps
shadowslair
Posts: 758
Joined: Mon Mar 31, 2008 3:32 pm
Location: Bulgaria

Post by shadowslair »

XP, P dual 3GHz, 2GB, HD5670 1GB(no clock), default camera orientation:
60 1 b - 216k cubes - 156fps
60 2 b - 216k cubes - 166fps
"Although we walk on the ground and step in the mud... our dreams and endeavors reach the immense skies..."
pixartist
Posts: 25
Joined: Thu Apr 14, 2011 4:58 pm

Post by pixartist »

greenya wrote:
Well, I don't mind splitting up meshbuffers, as long as I don't need 50 lines of hilariously confusing code to get the indices right :D
I believe if you want to use meshbuffers directly and from time to time modify its content, you anyway will need to "dance" around those indices. 8)

I'm sorry, I don't really understand why the indices are so complex. Basically they just give the order in which the vertices should be traversed, right?
So when I have 3 vertices, the indices [0,1,2] would create a face pointing one way and [2,1,0] the other way, and when I have 4 vertices I could draw a rectangle using something like [0,1,2,1,3,2], right?

So what exactly does updateVertices() do? Does it overwrite the current vertex array? And the same for indices?
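The winding question can be checked numerically: reversing the index order flips the sign of the triangle's normal. Here is a minimal sketch in plain Python (no engine calls), assuming a hypothetical layout with the four corners of a unit rectangle in the z = 0 plane:

```python
def face_normal(vertices, tri):
    """Unnormalized normal of triangle `tri` (three indices into `vertices`),
    computed as (b - a) x (c - a)."""
    a, b, c = (vertices[i] for i in tri)
    u = [b[k] - a[k] for k in range(3)]
    v = [c[k] - a[k] for k in range(3)]
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

# Four corners of a unit rectangle in the z = 0 plane (hypothetical layout).
vertices = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)]

print(face_normal(vertices, [0, 1, 2]))   # (0, 0, 1)
print(face_normal(vertices, [2, 1, 0]))   # (0, 0, -1)

# Both triangles of the rectangle [0,1,2,1,3,2] face the same way.
rect = [0, 1, 2, 1, 3, 2]
print([face_normal(vertices, rect[i:i + 3]) for i in (0, 3)])
# [(0, 0, 1), (0, 0, 1)]
```

Which of the two directions the renderer treats as front-facing depends on its winding convention and culling settings, so this only shows that the two orders are opposite, not which one will be visible.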