3000th commit - IrrlichtBAW (GIT repo, v 0.3.0-gamma1)

devsh
Competition winner
Posts: 2057
Joined: Tue Dec 09, 2008 6:00 pm
Location: UK

Re: To The Rescue of Your FPS - Build A World's Irrlicht

Post by devsh »

1) TRUE
2) Meld or any other decent GUI diff tool can diff two directories (if this wasn't possible, I would be unable to merge in the latest Irrlicht).
3) No demos will work, as Irrlicht's GUI has been removed (but examples 1 through 3 will work with a little modification).
4) EMT_SOLID is defined as a shader and will work. EMT_TRANSPARENT_ALPHA_REF is basically a solid shader with "discard;" thrown in (you need to do that yourself, I believe), but EMT_TRANSPARENT_ALPHA_CHANNEL should already be there and should work if you set material.BlendOperation = video::EBO_ADD;
christianclavet
Posts: 1638
Joined: Mon Apr 30, 2007 3:24 am
Location: Montreal, CANADA

Re: To The Rescue of Your FPS - Build A World's Irrlicht

Post by christianclavet »

Thanks devsh.

I'm sorry, I won't be able to review and check your work for now. It's too much work, and I'm too much of a novice in certain aspects to be efficient with this. Adding stuff to Irrlicht is fine for me, but doing all of this is too hard for me at the moment.

I still want to congratulate you on this, because the images you and your team presented in the forum are really incredible. Perhaps at a later time when the project is more accessible.
devsh
Competition winner
Posts: 2057
Joined: Tue Dec 09, 2008 6:00 pm
Location: UK

Re: To The Rescue of Your FPS - Build A World's Irrlicht

Post by devsh »

I'd like to report that we're still heading towards an IVideoDriver for the OpenGL 3.2 core profile and the OpenGL 3.1 forward-compatible profile, but due to other workloads Irrlicht development has stopped.

But now I'm getting reminders about the Mac version of the game, so OpenGL 3.2 core is inevitable.

Also, there remains the persistent problem of CPU-GPU sync points, so deferred updating (i.e. proper streaming) of VBOs is coming.
devsh
Competition winner
Posts: 2057
Joined: Tue Dec 09, 2008 6:00 pm
Location: UK

Re: To The Rescue of Your FPS - Build A World's Irrlicht

Post by devsh »

A prime example of why we should ditch DX8, DX9, and the OpenGL compatibility/fixed-function pipeline:

http://www.omgubuntu.co.uk/2014/05/supe ... provements

Otherwise this will become an engine for mediocre mobile apps.
devsh
Competition winner
Posts: 2057
Joined: Tue Dec 09, 2008 6:00 pm
Location: UK

Re: To The Rescue of Your FPS - Build A World's Irrlicht

Post by devsh »

update:

Detected serious variable-shadowing problems. One example was IQ3Shader.h, where a wave function was being evaluated: inside the function, local variables were defined that shadowed members, and those locals then got written to.
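
A minimal illustration of this class of bug, using a hypothetical `Wave` struct (the names are invented for the example and are not taken from IQ3Shader.h). A local declared inside a member function hides the member of the same name, so the write never reaches the object. GCC's `-Wshadow` option reports this pattern.

```cpp
#include <cassert>

// Hypothetical stand-in for a shader wave description; not the real IQ3Shader.h types.
struct Wave {
    float phase = 0.0f;
    float frequency = 1.0f;

    // BUG: the local 'phase' shadows the member, so the member is never updated.
    void advanceShadowed(float dt) {
        float phase = this->phase + frequency * dt; // new local hides the member
        (void)phase;                                // result is lost on return
    }

    // FIX: no redeclaration, so the assignment targets the member.
    void advanceFixed(float dt) {
        phase = phase + frequency * dt;
    }
};
```

Calling `advanceShadowed` leaves `phase` untouched, while `advanceFixed` updates it as intended.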
devsh
Competition winner
Posts: 2057
Joined: Tue Dec 09, 2008 6:00 pm
Location: UK

Re: To The Rescue of Your FPS - Build A World's Irrlicht

Post by devsh »

SIMD vectors are being implemented this very moment.

However, it will take longer for them to creep into the actual engine core.

Head over to http://irrlicht.sourceforge.net/forum/v ... =9&t=50230
devsh
Competition winner
Posts: 2057
Joined: Tue Dec 09, 2008 6:00 pm
Location: UK

Re: To The Rescue of Your FPS - Build A World's Irrlicht

Post by devsh »

Update:
We now have cross-platform condition variables and faster mutexes on Windows.
christianclavet
Posts: 1638
Joined: Mon Apr 30, 2007 3:24 am
Location: Montreal, CANADA

Re: To The Rescue of Your FPS - Build A World's Irrlicht

Post by christianclavet »

I just saw your post about "Irrlicht being removed from SuperTuxKart". I was afraid they'd really removed it, but they only switched all rendering from the fixed pipeline to shaders. Still, I think it's bad, as most people will think that they replaced Irrlicht with a new engine.

I hope IrrRenderer will get new updates; it would be really interesting to have a "shader" library available for Irrlicht. At the least, if we had something "solid" on the rendering side, it could be added as an add-on to vanilla Irrlicht (IRRExt).
devsh
Competition winner
Posts: 2057
Joined: Tue Dec 09, 2008 6:00 pm
Location: UK

Re: To The Rescue of Your FPS - Build A World's Irrlicht

Post by devsh »

UPDATE:

Soon I shall be going back to GPU/Irrlicht development after a long spell of working on the server side.

One of the first items on my list is to design a significant rework (almost a rewrite) of Irrlicht. The most impressive change will be the buffer-centric design, where almost all data is mapped from an IBuffer object (which may or may not be on the GPU), so that many meshbuffers can store their data in the same VBO, etc.
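
A rough sketch of what such a buffer-centric layout could look like. All names here (`Buffer`, `MeshBufferView`, `BufferSuballocator`) are hypothetical and only illustrate the idea; the storage is plain CPU memory, whereas the planned IBuffer could also be backed by a VBO.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical buffer object; here it only wraps CPU memory.
class Buffer {
public:
    explicit Buffer(std::size_t bytes) : storage(bytes) {}
    std::uint8_t* data() { return storage.data(); }
    std::size_t size() const { return storage.size(); }
private:
    std::vector<std::uint8_t> storage;
};

// A mesh buffer owns no memory: it references a byte range of a shared Buffer.
struct MeshBufferView {
    std::shared_ptr<Buffer> buffer;
    std::size_t offset; // byte offset of this mesh's data inside the buffer
    std::size_t length; // byte length of the range
};

// Hands out consecutive ranges of one buffer (simple bump allocator, no freeing).
class BufferSuballocator {
public:
    explicit BufferSuballocator(std::shared_ptr<Buffer> b)
        : buffer(std::move(b)), cursor(0) {}

    MeshBufferView allocate(std::size_t bytes) {
        assert(cursor + bytes <= buffer->size() && "shared buffer exhausted");
        MeshBufferView view{buffer, cursor, bytes};
        cursor += bytes;
        return view;
    }
private:
    std::shared_ptr<Buffer> buffer;
    std::size_t cursor;
};
```

Two meshes allocated from the same `BufferSuballocator` end up sharing one `Buffer` and differ only in their offsets, which is what would let them live in a single VBO.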

The second biggest change is the addition of OpenGL multi-threading (sane MT), with 2 threads for upload and download streaming. For that purpose, support for OpenGL GPU fences will be added.
This will be followed by the decoupling of the I/O system from the IVideoDriver (so that they can run in separate threads). That will enable us to put the rendering loop in its own thread while the main thread polls for user input and runs the game logic.
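
The decoupling could be sketched roughly as follows, using only the C++11 standard library. `InputEvent`, `EventQueue` and `runDecoupled` are hypothetical names made up for this example, and no actual rendering or window polling takes place: the main thread pushes events into a locked queue and a separate thread drains it.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical input event; a real engine would carry far more state.
struct InputEvent { int key; };

// Mutex-protected queue shared between the main thread and the render thread.
class EventQueue {
public:
    void push(InputEvent e) {
        std::lock_guard<std::mutex> guard(lock);
        events.push(e);
    }
    bool tryPop(InputEvent& out) {
        std::lock_guard<std::mutex> guard(lock);
        if (events.empty()) return false;
        out = events.front();
        events.pop();
        return true;
    }
private:
    std::mutex lock;
    std::queue<InputEvent> events;
};

// Main thread produces events; the render thread consumes them.
// Returns how many events the render thread handled.
inline int runDecoupled(int eventCount) {
    assert(eventCount >= 0);
    EventQueue queue;
    std::atomic<bool> running(true);
    int handled = 0; // written only by the render thread, read after join()

    std::thread renderThread([&] {
        InputEvent e;
        for (;;) {
            // Read the flag first: if it was already false, every push has
            // happened, so the drain below is guaranteed to see all events.
            const bool wasRunning = running.load();
            while (queue.tryPop(e)) ++handled;
            if (!wasRunning) break;
            std::this_thread::yield(); // a real loop would render a frame here
        }
    });

    for (int i = 0; i < eventCount; ++i) queue.push(InputEvent{i});
    running.store(false);
    renderThread.join();
    return handled;
}
```

The sketch only shows the hand-off between the two threads; it does nothing about input latency or scheduling jitter.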

The final big change is the adoption of DSA and migration to the OpenGL 3.3 core profile (the lowest supported context), plus use of the OpenGL Debug Output extension. That lets the driver report when you are causing errors without you having to put glGetError() after every OpenGL call, and it can display warnings about CPU-GPU syncs and other things you're doing wrong.
hendu
Posts: 2600
Joined: Sat Dec 18, 2010 12:53 pm

Re: To The Rescue of Your FPS - Build A World's Irrlicht

Post by hendu »

Good luck. MT and fences are buggy on most drivers.
devsh
Competition winner
Posts: 2057
Joined: Tue Dec 09, 2008 6:00 pm
Location: UK

Re: To The Rescue of Your FPS - Build A World's Irrlicht

Post by devsh »

When dealing with Intel or AMD APUs, it gives an epic boost (because all the transfer happens on the CPU side, since the GPU shares page-locked RAM with the CPU).

However, I haven't decided whether to multithread OpenGL with multiple contexts (shared resources), because that incurs a constant per-frame penalty for dealing with 2 contexts.
Either way, there will be a separate thread for working with pinned/mapped buffer memory.
REDDemon
Developer
Posts: 1044
Joined: Tue Aug 31, 2010 8:06 pm
Location: Genova (Italy)

Re: To The Rescue of Your FPS - Build A World's Irrlicht

Post by REDDemon »

devsh wrote: ---
This will be followed by the decoupling of the I/O system from the IVideoDriver (so that they can run in separate threads), this will enable us to put the rendering loop in a different thread while polling for user input and doing main-thread game logic in another (the main).
---
Even id Software tried that. On some GPUs you just get unstable rendering (the movement is not fluid because updates from input to renderer arrive at random times due to thread scheduling, which is very noticeable at 60 or 30 fps). In fact, in Doom 3 you can disable that with an option. That's probably no longer an issue on any modern system.

There's no need for 2 contexts I think:

-Upload can be done with buffer orphaning (not very good on Intel cards, but overall the best performing solution) from the same render thread (you even have 1 less thread XD)
-Download from GPU should stall (taking a screenshot)
-If you need to download only because you want to upload the data again, then just keep the data on the GPU (render to VBO or to texture)
-Of course occlusion queries should still be performant

(Memory fences are really buggy, and their performance depends heavily on the context and the driver/GPU. Instead, you could start thinking about using a modern API like Mantle or Vulkan, which is async "at core".)
Junior Irrlicht Developer.
Real value in social networks is not about "increasing" number of followers, but about getting in touch with Amazing people.
- by Me
devsh
Competition winner
Posts: 2057
Joined: Tue Dec 09, 2008 6:00 pm
Location: UK

Re: To The Rescue of Your FPS - Build A World's Irrlicht

Post by devsh »

REDDemon wrote: Upload can be done with buffer orphaning (not very good on Intel cards, but overall the best performing solution) from same render thread (you even have 1 less thread XD)
No. If you read OpenGL Insights, you'll find that the best-performing option is glMapBufferRange with GL_MAP_UNSYNCHRONIZED_BIT.
AGAIN, YOU NEED TO THREAD (multiple contexts) to use DMA on Nvidia and to save time on Intel and AMD APUs (a different core copies the data around in RAM to the page-locked location).

REDDemon wrote: Download from GPU should stall (taking a screenshot)
Why??? I think that you think that if you request a screenshot download, you will grab the very frame that is on the screen right now. THAT IS NOT TRUE. You will grab the frame generated after all the previous OpenGL commands have completed, but you will stall the CPU for no reason whatsoever.

REDDemon wrote: If you need download because you want to upload data again, then just keep data on GPU (render to VBO or to Texture)
And how am I going to perform non-parallelizable operations on buffers???

REDDemon wrote: Of course occlusion queries should still be performant
Occlusion queries are a performance bottleneck by the very design of the API: they cannot be batched (e.g. dispatching occlusion queries for 1000 bboxes in parallel). Hence a different solution has to be pursued!

REDDemon wrote: (memory fences are really bugged, and their performance is really dependend on context and driver/gpu: instead you could start thinking using a modern api like mantle or vulkan wich is async "at core")
You have your terms confused here: there are no "memory fences" in the OpenGL spec, there are "fences" and "memory barriers". I don't care about memory barriers; they're to do with Compute and Image Load/Store. I only care about fences. Fences are very simple for driver developers to implement (it's basically a mutex/condition wait) because they live only on the driver/CPU side (and should be GPU-architecture agnostic), so I highly doubt they are broken.
The Vulkan spec is not even finalized, let alone implemented yet. You can be sure that as soon as it's out I will ditch OpenGL 4.1+ support in favour of a Vulkan implementation (while keeping OpenGL 3.3-4.0 core + extensions for older GPUs).
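
To make the "mutex plus condition wait" comparison concrete, here is a CPU-only model of a fence built on the C++11 standard library. `CpuFence` is a hypothetical name and this is only an analogy under that assumption; it is not how any particular driver implements `glFenceSync` / `glClientWaitSync`, where the signal comes from the GPU finishing the preceding commands.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical CPU-only model of a fence: one side signals, the other waits.
class CpuFence {
public:
    // Marks the fence as passed and wakes every waiter.
    void signal() {
        {
            std::lock_guard<std::mutex> guard(lock);
            signaled = true;
        }
        passed.notify_all();
    }

    // Returns true if the fence was signaled within the timeout, false otherwise.
    // The predicate protects against spurious wakeups.
    bool wait(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> guard(lock);
        return passed.wait_for(guard, timeout, [this] { return signaled; });
    }

private:
    std::mutex lock;
    std::condition_variable passed;
    bool signaled = false;
};
```

A waiter on an unsignaled fence simply times out, while a fence signaled from another thread releases the waiter; the timeout plays a role similar to the timeout argument of `glClientWaitSync`.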
REDDemon
Developer
Posts: 1044
Joined: Tue Aug 31, 2010 8:06 pm
Location: Genova (Italy)

Re: To The Rescue of Your FPS - Build A World's Irrlicht

Post by REDDemon »

devsh wrote: ---
No, if you read OpenGL Insights, you find out best perf option is glMapBufferRange with GL_MAP_UNSYNCHRONIZED_BIT
AGAIN, YOU NEED TO THREAD (Multiple Context) to use DMA on Nvidia and to save time on Intel and AMD APUs (different core copies data around in RAM to page-locked location).
---
I've already read that. This is not only about speed: orphaning may not be the fastest solution, but it is the most reliable. Orphaning can be done without taking too much care in the rest of the code, whereas unsynchronized mapping may cause partially updated content to be uploaded.
OpenGL Insights wrote: The drawback is that we really have to know what we’re doing. No implicit sanity check or synchronization is performed, so if we upload data to a buffer that is currently being used for rendering, we can end up with an undefined behavior or application crash.

Yep, Vulkan is not ready, but Mantle is ;)
Cube_
Posts: 1010
Joined: Mon Oct 24, 2011 10:03 pm
Location: 0x45 61 72 74 68 2c 20 69 6e 20 74 68 65 20 73 6f 6c 20 73 79 73 74 65 6d

Re: To The Rescue of Your FPS - Build A World's Irrlicht

Post by Cube_ »

Huge props for this. You just saved me a ton of headaches, since I was thinking I'd have to fix occlusion queries, strip the rendering backends I don't use (DX, software), and see if I could hack together some 2D texture array implementation that does not require OGL 3.x. I'd assume your implementation does require OGL 3.x, so what I'll probably have to do is make a texture array dummy class that builds an in-memory atlas, keeps the texture offsets in the atlas and looks them up, but still lets me specify the texture as if a texture array were used.
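
One way such a dummy class could be sketched, assuming equally sized layers packed into a regular grid. `AtlasTextureArray` and `AtlasRegion` are hypothetical names for this example, and only the offset bookkeeping is shown, not the pixel copying or any OpenGL calls.

```cpp
#include <cassert>
#include <cstddef>

// UV transform that maps a layer's local [0,1] coordinates into the atlas.
struct AtlasRegion {
    float offsetU, offsetV; // lower corner of the layer inside the atlas
    float scaleU, scaleV;   // size of one layer relative to the whole atlas
};

// Hypothetical stand-in for a 2D texture array: layers laid out row by row in a grid.
class AtlasTextureArray {
public:
    AtlasTextureArray(std::size_t columns, std::size_t rows)
        : cols(columns), rowCount(rows) {
        assert(columns > 0 && rows > 0);
    }

    std::size_t layerCount() const { return cols * rowCount; }

    // Looks up a layer by index, just as a texture array layer would be addressed.
    AtlasRegion region(std::size_t layer) const {
        assert(layer < layerCount());
        const float scaleU = 1.0f / static_cast<float>(cols);
        const float scaleV = 1.0f / static_cast<float>(rowCount);
        const float x = static_cast<float>(layer % cols);
        const float y = static_cast<float>(layer / cols);
        return AtlasRegion{x * scaleU, y * scaleV, scaleU, scaleV};
    }

private:
    std::size_t cols;
    std::size_t rowCount;
};
```

Final texture coordinates would then be `offset + uv * scale` per layer. An atlas can bleed between neighbouring layers under filtering and mipmapping, so padding around each layer would likely be needed in practice.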
"this is not the bottleneck you are looking for"