3000th commit - IrrlichtBAW (GIT repo, v 0.3.0-gamma1)

hendu
Posts: 2600
Joined: Sat Dec 18, 2010 12:53 pm

Re: To The Rescue of Your FPS - BAW Irrlicht (GIT repo, ver

Post by hendu »

"target specific option mismatch" means that you likely did not enable the required SSE level.
devsh
Competition winner
Posts: 2057
Joined: Tue Dec 09, 2008 6:00 pm
Location: UK

Post by devsh »

The SHA256 stuff is not our fault, maybe GCC 6.3 is indeed too recent

The Makefile doesn't work except for the BAW_SERVER target or something like that; we only build from the Code::Blocks project on Linux and from Visual Studio on Windows.

Post by devsh »

Added non-recursive versions of updateAbsolutePosition and needsAbsolutePositionUpdate, which no longer run in O(n^2) time, where n is the depth of the SceneNode hierarchy.
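
A minimal sketch of the non-recursive idea, not the engine's actual code: the `Node` struct, its fields and the single float standing in for a full transform matrix are all hypothetical. The function walks up the parent chain once, then refreshes only the stale entries from the root down, so one call costs O(depth). A version counter lets a node notice that its parent moved even if the node itself was never marked dirty.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a scene node; a float offset replaces the 4x4 matrix.
struct Node {
    Node* parent = nullptr;
    float relative = 0.0f;          // position relative to the parent
    float absolute = 0.0f;          // cached absolute position
    bool dirty = true;              // `relative` changed since `absolute` was cached
    unsigned version = 0;           // bumped whenever `absolute` is recomputed
    unsigned parentVersionSeen = 0; // parent's version when `absolute` was last computed
};

// Iteratively bring the cached absolute position of `node` up to date.
void updateAbsolutePosition(Node* node) {
    assert(node != nullptr);
    // Pass 1: collect the chain from `node` up to the root.
    std::vector<Node*> chain;
    for (Node* n = node; n != nullptr; n = n->parent) chain.push_back(n);
    // Pass 2: walk from the root down, recomputing only stale entries.
    for (std::size_t i = chain.size(); i-- > 0;) {
        Node* n = chain[i];
        const bool parentMoved = n->parent && n->parent->version != n->parentVersionSeen;
        if (n->dirty || parentMoved) {
            n->absolute = (n->parent ? n->parent->absolute : 0.0f) + n->relative;
            if (n->parent) n->parentVersionSeen = n->parent->version;
            n->dirty = false;
            ++n->version;
        }
    }
}
```

Changing a node's `relative` value and setting `dirty = true` is enough; descendants pick up the change the next time they are updated.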

Post by devsh »

Abandoned the above idea

Carrying out first tests of Hardware Skinning

Current version should be fully OpenGL core compatible

Post by devsh »

The repository has moved to a new address

0.2 is being committed

For the 0.3 Release:
1) Create the OpenGL context in Core Profile
2) Up the minimum OpenGL version to 4.0 (Sandy Bridge Intel, Fermi Nvidia, Radeon HD5000)
3) Abandon the Linux, Mac OS X and possibly Windows devices in favour of SDL2
4) Remove all s8,u8,s16,u16,s32,u32,s64,u64 and f32 and replace them with float and the types from stdint.h
5) CPU culling workarounds for small instance counts to avoid GPU idling
6) Bounding Box culling to avoid animating/boning meshes which are guaranteed to be off-screen
7) Better CPU to GPU Mesh conversion modes (making sure vertex attributes are interleaved)
8) Native irrlicht mesh format save and load + encryption (index and attribute buffers)
9) InstancedSkinnedMeshSceneNode with LoDs (together with a re-skin function for cpu meshes to reduce the number of bone weights per vertex)
10) 1D Textures
11) Compute Shaders
12) Quantization optimization post-load for CPU mesh vertex attributes
13) MultiDraw{Indirect} and some other Vulkan like features
14) My own render state update system, as some render commands/materials don't need to check and update every render state
15) Assimp for model format import/export which will load textures without putting them in GPU memory, this will enable multi-threading of mesh loading
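
Item 7 can be illustrated with a small sketch. It assumes a mesh stored as separate position, normal and UV arrays; the `Vertex` layout and the `interleave` function are made up for illustration and are not engine API. Interleaving copies each vertex's attributes next to each other, so the GPU fetches one contiguous record per vertex.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical interleaved layout: 3 floats position, 3 floats normal, 2 floats UV.
struct Vertex {
    float pos[3];
    float normal[3];
    float uv[2];
};

// Builds one interleaved array from three separate attribute streams.
std::vector<Vertex> interleave(const std::vector<float>& positions,
                               const std::vector<float>& normals,
                               const std::vector<float>& uvs) {
    const std::size_t count = positions.size() / 3;
    assert(positions.size() == count * 3);
    assert(normals.size() == count * 3 && uvs.size() == count * 2);
    std::vector<Vertex> out(count);
    for (std::size_t i = 0; i < count; ++i) {
        for (int c = 0; c < 3; ++c) {
            out[i].pos[c] = positions[3 * i + c];
            out[i].normal[c] = normals[3 * i + c];
        }
        out[i].uv[0] = uvs[2 * i];
        out[i].uv[1] = uvs[2 * i + 1];
    }
    return out;
}
```

With a layout like this, the vertex stride is `sizeof(Vertex)` and each attribute's offset can be taken with `offsetof`.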

Sometime Later:
A) Atomic Reference Counting
B) STL vector and list used in place of irrlicht's containers
C) C++11 mutexes and maybe some other stuff
D) GPU Boning
E) SSE3 SIMD Dual Quaternion Class
F) Dual Quaternion Skinning
G) SSE3 SIMD classes used for all 2D/3D math
H) Multisample renderbuffers and textures
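
For item A, a rough sketch of what atomic reference counting could look like, assuming C++11 `std::atomic` is available. The class name is hypothetical; only the grab()/drop() convention and the initial count of 1 are borrowed from Irrlicht's IReferenceCounted.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical class following the grab()/drop() convention of IReferenceCounted.
class AtomicRefCounted {
public:
    AtomicRefCounted() : refCount(1) {}
    virtual ~AtomicRefCounted() {}

    // Adds one owner; relaxed ordering is enough for a pure increment.
    void grab() const { refCount.fetch_add(1, std::memory_order_relaxed); }

    // Removes one owner; returns true if this call destroyed the object.
    bool drop() const {
        // acq_rel makes other owners' writes visible before the delete.
        if (refCount.fetch_sub(1, std::memory_order_acq_rel) == 1) {
            delete this;
            return true;
        }
        return false;
    }

    int getReferenceCount() const { return refCount.load(std::memory_order_relaxed); }

private:
    AtomicRefCounted(const AtomicRefCounted&);            // non-copyable
    AtomicRefCounted& operator=(const AtomicRefCounted&); // non-assignable
    mutable std::atomic<int> refCount;
};
```

The count itself becomes thread-safe this way; the object's other members still need their own synchronisation.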

Version 1.0 Roadmap:
1) AVX/AVX2 versions of all SIMD stuff with separate library builds
2) Vulkan Renderer
3) Android builds

Post by devsh »

I'm guessing my gcc version is too recent for irr 1.8.3. Latest Irrlicht trunk revision compiles fine - I will try to look into this later.
The SHA256 stuff is not our fault, maybe GCC 6.3 is indeed too recent
We (with bkeys) found out that it was due to GCC 6 using -std=gnu++14 as its default dialect, so you need to add -std=c++03 as an extra CFLAGS option.
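
A quick way to confirm which dialect the compiler is actually using is to look at the standard `__cplusplus` macro. This is only a diagnostic sketch and not part of the engine; the function names are made up.

```cpp
#include <cassert>
#include <cstdio>

// Maps the standard __cplusplus macro to a readable dialect name.
const char* cxxDialect() {
#if __cplusplus >= 201402L
    return "C++14 or later";
#elif __cplusplus >= 201103L
    return "C++11";
#else
    return "C++98/03";
#endif
}

// Prints the dialect this translation unit was compiled with.
void printDialect() { std::puts(cxxDialect()); }
```

Building this once with the default flags and once with -std=c++03 shows whether the option reached the compiler.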
bkeys
Posts: 42
Joined: Sat Feb 27, 2016 9:06 pm

Post by bkeys »

devsh wrote:I threw in an OpenCL device manager for teh lulz today
Sounds awesome! Maybe we can hang out on IRC at some point and tap up an example/documentation for this.
devsh wrote:
I'm guessing my gcc version is too recent for irr 1.8.3. Latest Irrlicht trunk revision compiles fine - I will try to look into this later.
devsh wrote: We (with bkeys) found out that it was due to GCC 6 using -std=gnu++14 as its default dialect, so you need to add -std=c++03 as an extra CFLAGS option.
Someone at some point should try to bump up the C++ version; perhaps we can move a few standards ahead and the engine would still compile as it did before. If no one else has the time I may do this in a few weeks.
- Brigham Keys, Esq.
christianclavet
Posts: 1638
Joined: Mon Apr 30, 2007 3:24 am
Location: Montreal, CANADA

Post by christianclavet »

Hi Devsh!

Decided today to try out the engine by compiling one of the demos (Hardware instancing). I must be doing something really wrong!
Here is a picture of it running on my 4k screen with my GTX 1080:
Image

It gives only 19 fps?! I know there are lots of models on screen, but with the hardware I have I was not expecting this (I play most of my games at max quality on a 4k screen and get 60fps+ most of the time).
EDIT: Changed some values in your demo to get only one "cow" and the primitive count is really high (like 5000 primitives). Are these "primitives" tris?

Post by devsh »

The primitive count is really high, plus make sure you compile with -O3 optimizations.

The slowdown is most probably due to your CPU and PCIe bus: the engine needs to read back the occlusion results before drawing the next batch, so there is a stall.
The next version will determine LoD instance arrays in one pass (4 at a time) and the one after that will use indirect draws.

Post by devsh »

Found and straightened out some bugs in ver 0.2 (yet to commit):
1) bug in window format finding GL core profile Linux+Nvidia giving transparent windows (pretty cool feature though)
2) bug in setTexture where if old texture was removed it would crash
3) bad quantization optimization in the OBJ loader resulting in messed up meshes
4) CPU culling workarounds for small instance counts to avoid GPU idling
5) Normal quantization to 30bit in x meshes which use Skinning
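
A sketch of what 30-bit normal quantization (item 5) can look like. It assumes three signed, normalized 10-bit components packed the way OpenGL's `GL_INT_2_10_10_10_REV` vertex format lays them out, with the top 2 bits unused; the post does not say which layout the engine really uses, and the function names are invented.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Quantizes one component in [-1, 1] to a signed 10-bit two's complement field.
std::uint32_t packComponent10(float v) {
    const float c = std::max(-1.0f, std::min(1.0f, v));
    const std::int32_t q = static_cast<std::int32_t>(std::lround(c * 511.0f)); // [-511, 511]
    return static_cast<std::uint32_t>(q) & 0x3FFu;
}

// Packs x into bits 0-9, y into bits 10-19 and z into bits 20-29.
std::uint32_t quantizeNormal30(float x, float y, float z) {
    return packComponent10(x) | (packComponent10(y) << 10) | (packComponent10(z) << 20);
}

// Recovers one component from the low 10 bits of `bits`.
float unpackComponent10(std::uint32_t bits) {
    std::int32_t q = static_cast<std::int32_t>(bits & 0x3FFu);
    if (q & 0x200) q -= 0x400; // sign-extend the 10-bit value
    return std::max(static_cast<float>(q) / 511.0f, -1.0f);
}
```

Each normal shrinks from 12 bytes (three floats) to 4 bytes, at the cost of roughly 1/511 precision per component.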

Stuff left to do for the 0.3 release:
1) Abandon the Linux, Mac OS X and possibly Windows devices in favour of SDL2
2) Remove all s8,u8,s16,u16,s32,u32,s64,u64 and f32 and replace them with float and the types from stdint.h
3) Remove core::stringc, array and list and replace them with std containers using allocators friendly to SSE3, AVX and 4096-byte (page-locked) alignment
4) Assimp for model format import/export which will load textures without putting them in GPU memory, this will enable multi-threading of mesh loading

Roadmap for 0.4 release:
1) Global mesh optimization function (do forsyth index optimization and re-quantization into integers)
2) Better CPU to GPU Mesh conversion modes (making sure vertex attributes are interleaved)
3) Native irrlicht mesh format save and load + encryption (index and attribute buffers)
4) Shader Subroutines -> My own render state update system, as some render commands/materials don't need to check and update every render state
5) Bounding Box culling to avoid animating/boning meshes which are guaranteed to be off-screen
6) InstancedSkinnedMeshSceneNode with LoDs (together with a re-skin function for cpu meshes to reduce the number of bone weights per vertex)
7) 1D Textures
8) Compute Shaders
9) MultiDrawIndirect and some other render-command-list like features
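
The bounding box culling in item 5 could be sketched as a box-versus-plane test. The `Plane` and `Aabb` structs and the `isCulled` function are hypothetical, and the frustum is assumed to be given as planes whose normals point inwards.

```cpp
#include <cassert>
#include <cstddef>

struct Plane { float nx, ny, nz, d; };      // inside when nx*x + ny*y + nz*z + d >= 0
struct Aabb  { float min[3]; float max[3]; };

// Returns true only when the box lies entirely behind at least one plane.
// The test is conservative: a few boxes near frustum corners may be kept.
bool isCulled(const Aabb& box, const Plane* planes, std::size_t planeCount) {
    assert(planes != nullptr || planeCount == 0);
    for (std::size_t i = 0; i < planeCount; ++i) {
        const Plane& p = planes[i];
        // Pick the box corner that lies furthest along the plane normal.
        const float x = p.nx >= 0.0f ? box.max[0] : box.min[0];
        const float y = p.ny >= 0.0f ? box.max[1] : box.min[1];
        const float z = p.nz >= 0.0f ? box.max[2] : box.min[2];
        if (p.nx * x + p.ny * y + p.nz * z + p.d < 0.0f) return true;
    }
    return false;
}
```

A mesh whose box is culled is guaranteed to be off-screen, so its animation and bone updates can be skipped for that frame.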

Roadmap for 0.5:
1) Super-fast per-thread malloc/new pool allocator
2) Super-fast thread-safe malloc/new operators

Roadmap for 0.6:
1) SSE3 SIMD classes used for all 2D/3D math
2) AVX/AVX2 versions of all SIMD stuff with separate library builds
3) Vulkan Renderer
4) Android builds

Sometime Later:
A) Atomic Reference Counting
B) C++11 mutexes and maybe some other stuff
C) GPU Boning
D) SSE3 SIMD Dual Quaternion Class
E) Dual Quaternion Skinning
F) Multisample renderbuffers and textures
Last edited by devsh on Sat Feb 25, 2017 3:45 pm, edited 1 time in total.

Post by devsh »

As soon as we cap GL version to 4.2 or require ARB_shader_image_load_store/EXT_shader_image_load_store, we will use a unified shader for culling all instances (or at least batches of instances) at once.

The above approach could also be used for updating the transformations of the scenegraph, skeletal animation, culling and indirect drawing of everything.

P.S. GL_ARB_shader_atomic_counters seems to be present on all Intels with GL 4.0+, so it's a possibility to do the instance culling like that (or with image load store)
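
The atomic-counter approach can be pictured with a CPU analogue. This is not shader code and not the engine's implementation: `std::atomic` stands in for the GL atomic counter, a plain loop stands in for the parallel shader invocations, and a simple depth-range check stands in for the real visibility test. All names are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Instance { float x, y, z, radius; };

// Analogue of one shader invocation: test one instance and, if it is visible,
// reserve an output slot by incrementing the shared counter.
void cullOne(const Instance& inst, std::uint32_t index, float nearZ, float farZ,
             std::atomic<std::uint32_t>& visibleCount,
             std::vector<std::uint32_t>& visibleIndices) {
    const bool visible = inst.z + inst.radius >= nearZ && inst.z - inst.radius <= farZ;
    if (visible) {
        const std::uint32_t slot = visibleCount.fetch_add(1, std::memory_order_relaxed);
        visibleIndices[slot] = index;
    }
}

// Culls every instance and compacts the survivors' indices into `visibleIndices`.
std::uint32_t cullAll(const std::vector<Instance>& instances, float nearZ, float farZ,
                      std::vector<std::uint32_t>& visibleIndices) {
    visibleIndices.assign(instances.size(), 0);
    std::atomic<std::uint32_t> visibleCount(0);
    for (std::size_t i = 0; i < instances.size(); ++i) // parallel on a GPU
        cullOne(instances[i], static_cast<std::uint32_t>(i), nearZ, farZ,
                visibleCount, visibleIndices);
    const std::uint32_t count = visibleCount.load();
    visibleIndices.resize(count);
    return count;
}
```

The final counter value is the visible instance count, which is the number an indirect draw would need, so nothing has to be read back to the CPU in the GPU version.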

Post by devsh »

Version 0.2.2 will appear shortly, as soon as the bugs marked with * are resolved and the Windows project is updated

Bugs Fixed:
1) Transparent Windows on AMD/NVidia in GL Core Profile under Linux
2) Bad Mesh Quantization for the OBJ loader leading to slightly misplaced UVs and Vertices
3) Bug in setTexture where if an old texture was removed it would crash
4) Bug where Irrlicht wouldn't compile with GCC 6.3 (needs the option -std=c++03) because of some SHA256 code
5) Bug where some meshes in a Skinned X-Format Mesh attached to a bone but without weights would be in the wrong place (actually a quaternion bug)

New Features:
1) OpenCL device stub, finds associated device to the rendering GPU and can report number of GPU cores
2) CPU culling workarounds for small instance counts to avoid GPU idling
3) Normal quantization to 30bit in x meshes, even ones which use Skinning
4) MultiDrawIndirect and DrawIndirect handles to GL functions (can use this OpenGL feature if desired)
5) Minimum GL version is now 4.0 plus a few ubiquitous extensions, until Intel retires all Bay-Trail SOCs and Ivy Bridge CPUs
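
For item 4, the data that indirect draws consume can be sketched on the CPU side. The `DrawElementsIndirectCommand` layout is the one defined by the OpenGL specification; the `MeshRange` struct and `buildCommands` function are hypothetical and only show how one command per visible mesh might be filled in.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One indexed indirect draw, laid out as the OpenGL specification defines it.
struct DrawElementsIndirectCommand {
    std::uint32_t count;         // number of indices to draw
    std::uint32_t instanceCount; // number of instances to draw
    std::uint32_t firstIndex;    // offset into the index buffer, in indices
    std::int32_t  baseVertex;    // value added to every index
    std::uint32_t baseInstance;  // first instance's offset into per-instance data
};
static_assert(sizeof(DrawElementsIndirectCommand) == 20, "commands must be tightly packed");

// Hypothetical description of where one mesh lives in shared buffers.
struct MeshRange {
    std::uint32_t indexCount;
    std::uint32_t firstIndex;
    std::int32_t  baseVertex;
    std::uint32_t visibleInstances;
};

// Emits one command per mesh that still has visible instances.
std::vector<DrawElementsIndirectCommand> buildCommands(const std::vector<MeshRange>& meshes) {
    std::vector<DrawElementsIndirectCommand> commands;
    std::uint32_t baseInstance = 0;
    for (std::size_t i = 0; i < meshes.size(); ++i) {
        const MeshRange& m = meshes[i];
        if (m.visibleInstances == 0) continue; // nothing to draw for this mesh
        const DrawElementsIndirectCommand cmd = {
            m.indexCount, m.visibleInstances, m.firstIndex, m.baseVertex, baseInstance};
        commands.push_back(cmd);
        baseInstance += m.visibleInstances;
    }
    return commands;
}
```

The resulting array would be uploaded to a buffer bound to GL_DRAW_INDIRECT_BUFFER and submitted with a single glMultiDrawElementsIndirect call.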

The engine's minimum requirements will be Nvidia GeForce 400 series, Radeon HD 5000 series, and the Intel HD Graphics bundled with Ivy Bridge CPUs and up

Roadmap for the Immediate Next Release (in the order the features will appear):
1) Bounding Box culling to avoid animating/boning meshes which are guaranteed to be off-screen
2) Uniform Buffer Objects and getting rid of setShaderConstant
3) New Material State Tracking system
4) Separation of BaseMaterial (essentially Blend State) from actual shader programs
5) Shader Subroutines
6) InstancedSkinnedMeshSceneNode with LoDs (together with a re-skin function for cpu meshes to reduce the number of bone weights per vertex)
7) GPU Boning
8) Expensive Operation (Cass Everitt's "OpenGL Beyond Porting") Profiling and Tracking (MRT change, Shader Program, ROP, etc.)

Roadmap for Version 0.3:
1) SDL 2 Device
2) Removal of Irrlicht types like s8, u8, c8 etc.
3) Migrating to std::vector and list instead of core::array
4) Compute Shaders and SSBOs
5) Quaternion only rotations
6) SIMD only vector math

Roadmap for Version 0.4:
1) ASSIMP for model format import/export which will load textures without putting them in GPU memory, this will enable multi-threading of mesh loading
2) Global mesh optimization function (do forsyth index optimization and re-quantization into vertex attribute formats with less bit-depth)
3) Better CPU to GPU Mesh conversion modes (making sure vertex attributes are interleaved)
4) 1D Textures

Roadmap for 0.5:
1) Super-fast per-thread malloc/new pool allocator
2) Super-fast thread-safe malloc/new operators
3) Full-GPU compute shader scenegraph update and drawcommand generation
4) Nvidia Bindless, Sparse Textures and NV command-list

Roadmap for 0.6:
1) Bumping Minimum GL Version to 4.3
2) AVX/AVX2 versions of all SIMD stuff with separate library builds
3) Vulkan Renderer
4) Android builds

Sometime Later:
A) Atomic Reference Counting
B) Bumping C++ version to C++11 and using C++11 mutexes and maybe some other stuff
C) SSE3 SIMD Dual Quaternion Class
D) Dual Quaternion Skinning
E) Multisample renderbuffers and textures
Last edited by devsh on Wed Mar 15, 2017 5:33 pm, edited 2 times in total.