I would like to try that scene; I'll contact you shortly.
Quote
Ben Supnik
Re: performance, I found a few things with Bricksmith:
* A shader-based pipeline wasn't faster than fixed-function for the same basic work.
* CPU time is proportional to the number of bricks, not the number of vertices.
* When using the programmable pipeline and instancing techniques, Bricksmith is vertex-count bound, not CPU bound. It's not even close -- even with culling and transparent part sorting, I see CPU use of 15-25% while maxed out on fps with 30-40k parts. In this case the interesting number isn't the part count but the vertex count those parts yield. This is with a 4870 - a newer GPU would help the vertex-count bottleneck.
I wrote this up when I finished some perf measurements...
http://www.hacksoflife.blogspot.com/2013/01/instancing-for-bricksmith.html
Very interesting blog post. Although I'm not that deep into OpenGL and shaders, it gave me some nice pointers to try with the LDCad renderer. I too have a VBO per 'core' brick, which is drawn using glDrawElements once per brick reference. I also do some additional grouping/sorting to limit state changes, though. I can tell you that flat shading with unique vertex/normal pairs is about 15-20% faster (the uniqueness test is an option in LDCad), almost the same amount as the vertex reduction, confirming the point you are making in your text. I always use glDrawElements, though, even when there are no duplicate pairs (I haven't yet tried glDrawArrays with the option disabled).
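To make the "unique vertex/normal pairs" idea concrete, here is a minimal sketch of how a flat vertex list could be collapsed into a unique list plus an index list suitable for an indexed draw call. This is not LDCad's actual code; the function name `build_indexed_mesh` and the tuple layout are assumptions made up for illustration.

```python
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]
# One vertex = (position, normal). Both must match for two vertices to merge.
Vertex = Tuple[Vec3, Vec3]


def build_indexed_mesh(vertices: List[Vertex]) -> Tuple[List[Vertex], List[int]]:
    """Collapse duplicate (position, normal) pairs into an index list.

    Hypothetical helper: returns the unique vertices in first-seen order
    and one index per input vertex.
    """
    unique: List[Vertex] = []
    index_of: Dict[Vertex, int] = {}
    indices: List[int] = []
    for v in vertices:
        idx = index_of.get(v)
        if idx is None:
            # First time this exact pair is seen: store it.
            idx = len(unique)
            index_of[v] = idx
            unique.append(v)
        indices.append(idx)
    return unique, indices


# Two triangles forming a quad, all sharing the same flat normal.
n = (0.0, 0.0, 1.0)
a, b, c, d = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)
quad = [(a, n), (b, n), (c, n), (a, n), (c, n), (d, n)]
unique_vertices, index_list = build_indexed_mesh(quad)
```

With flat shading, vertices on a hard edge carry different normals and so stay separate, which is why the saving depends on how many pairs really are identical.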
Anyway, the main reason I want to try a non-fixed-function approach is to provide nicer (per-pixel) lighting and/or more noticeable differences between rubber, metal and plain plastic. I'm also hoping it would allow better BFC/culling handling (flipping normals in the shader to account for submodel mirroring, etc.), as you suggest on the blog.
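For the mirroring part, one way to decide when a flip is needed is the sign of the determinant of the 3x3 part of each reference matrix: a negative determinant means the transform mirrors and so reverses the winding. Below is a rough sketch of that test, assuming row-major 3x3 matrices; the names `det3` and `flip_sign` are made up for illustration, and passing the resulting sign to a shader as a uniform is only one possible use.

```python
from typing import List, Sequence

Mat3 = Sequence[Sequence[float]]


def det3(m: Mat3) -> float:
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def flip_sign(chain: List[Mat3]) -> float:
    """Return -1.0 if the nested transforms mirror overall, else 1.0.

    Because det(A * B) = det(A) * det(B), the signs of the individual
    determinants can be multiplied instead of multiplying the matrices.
    """
    sign = 1.0
    for m in chain:
        if det3(m) < 0:
            sign = -sign
    return sign


identity = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
mirror_x = [(-1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]

plain = flip_sign([identity])                     # no mirroring
mirrored = flip_sign([identity, mirror_x])        # one mirrored submodel
double = flip_sign([mirror_x, mirror_x])          # two mirrors cancel out
```

Two nested mirrored submodels cancel each other, so only the overall sign needs to travel down the reference tree.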
It would be a bonus if it's faster too :) Everything will remain optional anyway because LDCad renders on OpenGL as low as 1.1 and I would like to keep it that way.