The LDraw file format has held up surprisingly well in spite of its age. I’ve outlined a number of limitations when trying to achieve the best performance and results with modern hardware and applications. Any application wanting to work with and render LDraw files efficiently at scale needs to apply significant processing. A new version of the LDraw format should keep this processing in mind because it adds significant runtime cost as well as being ambiguous and difficult to implement.
I’ve worked with implementing and optimizing code for rendering and loading a number of binary and text file formats for 3D models in the video game modding scene. These are non issues with popular formats like glTF or collada and custom binary formats used for modern games. I’m not suggesting that we just adopt something like gltf or replace all the .dat files with .obj due to the unique requirements and optimization opportunities with LEGO models.
I’m curious to see what thoughts people have on what it would take to implement these changes. Some of these points may be more immediately obvious than others. I believe using a human readable format is a worthy goal, but we should also recognize that files are mostly read by machines. We should consider not only the needs of people authoring parts by hand but also the needs of developers. Many usability issues can be resolved with proper libraries and tooling.
Geometry
Modern GPUs are built to render indexed triangle lists. Applications that support quads will need to triangulate them before rendering. This also applies to renderers like Blender. Requiring triangles for LDraw models would simplify application code and also avoid the ambiguities that arise from quads. Future ray tracers will also likely want to work with triangles.
Applications currently need to calculate their own vertex indices. The naive approach is very slow and requires comparing each vertex with every other vertex to detect duplicates. Hashing is unreliable due to rounding errors. Using a distance threshold can be made faster using special tree structures, but it still takes a lot of processing. A lot of popular model formats already include indices like gltf, collada, or wavefront obj. Vertices that come from 3D modeling applications are typically already indexed, so users wouldn’t need to manually specify indices when using these programs to author parts.
Normals
Normals are a major issue with LDraw files. Figuring out the best way to deal with line types is best left to the threads already covering those topics. For GPU rendering, normals are usually defined per vertex instead of per face. LDraw files are unusual in that vertices are not indexed, and smooth normals have to be inferred purely from edge information. 3D modeling applications like Blender typically just have users define edges as “hard” or “smooth” and calculate the vertex indexing for you behind the scenes. Normals can be inferred from the vertex positions and indices, so defining custom normals is only necessary for effects like stylized character shading. LDraw wouldn’t need to explicitly include normals as long as the vertices were indexed correctly and parts were not authored with custom normals in mind.
Subparts
Indexing won’t work well with the current subpart usage. Reducing the number of objects is also beneficial in 3D modeling applications since they need to manage a lot of per object state like unique names, bounding boxes, etc. Subparts also add a lot of small files and negatively impact IO performance for file reads as well as compressing, decompressing, or moving the LDraw parts library. Even newer NVME SSDs are still much faster when using fewer, larger files.
I’m not advocating to completely remove subparts. Subfile references in parts files can still be helpful if a part is made up of multiple sections like joints or hinges. There may be other cases where adding another file reference is justified. I’d like to calculate how much bigger the parts library would be with and without zip compression when not using subparts at some point.
Binary Formats
I don’t think binary files are worth the loss in readability. The space savings from my experience come from using reduced precision for attributes like normals. It’s going to be hard to beat the size of a text file defined in integer LDraw units and even harder after compression. The advantage of binary formats is reduced processing when loading, which matters a lot for games. For a good implementation example that’s close to what graphics APIs work with, see gltf. It’s worth noting that GPUs prefer attributes defined per vertex for cache locality, but applications prefer separate lists for attributes like positions and UV maps for flexibility. We can’t optimize for both in the same format, so the benefits of binary files are mixed.
Level of Detail (LOD)
I don’t think this is an issue on modern hardware for an optimized renderer. A bigger concern is object counts since lots of small objects add a lot of overhead and don’t fully utilize the GPU. Indexing vertices greatly reduces the actual number of vertices processed by the GPU due to vertex reuse. Techniques like frustum and occlusion culling provide significant FPS improvements without requiring any LOD changes. The move towards realtime raytracing or virtualized geometry (UE5 Nanite) will make polygon counts even less of an issue. Path tracers like Blender Cycles already scale very well with polygon counts and benefit from instancing. Memory isn’t a concern since LDraw models instance the same parts many times.
I still think level of detail could be desirable in some cases like selecting a different stud type for CAD applications than final renders. For very large scenes, it may be beneficial to use reduced geometry for distant objects. I plan to investigate this at some point for the ldr_wgpu renderer I’ve been working on as well as other optimizations.
File Format
I’m intentionally not suggesting any specific format changes in this initial post. I’d like to incorporate my rendering library into a file viewer application. This would allow displaying LDraw files in a custom text format within the application. People could use the application to open and view existing LDraw files in the current format and see what it would look like in a different text format. I may also put together a tool for converting the entire parts library and outputting to a new folder. I would want to compare things like file sizes, file count, performance, and ease of implementation in different languages with a new format before I propose any specific changes to LDraw itself.
I’ve worked with implementing and optimizing code for rendering and loading a number of binary and text file formats for 3D models in the video game modding scene. These are non issues with popular formats like glTF or collada and custom binary formats used for modern games. I’m not suggesting that we just adopt something like gltf or replace all the .dat files with .obj due to the unique requirements and optimization opportunities with LEGO models.
I’m curious to see what thoughts people have on what it would take to implement these changes. Some of these points may be more immediately obvious than others. I believe using a human readable format is a worthy goal, but we should also recognize that files are mostly read by machines. We should consider not only the needs of people authoring parts by hand but also the needs of developers. Many usability issues can be resolved with proper libraries and tooling.
Geometry
Modern GPUs are built to render indexed triangle lists. Applications that support quads will need to triangulate them before rendering. This also applies to renderers like Blender. Requiring triangles for LDraw models would simplify application code and also avoid the ambiguities that arise from quads. Future ray tracers will also likely want to work with triangles.
Applications currently need to calculate their own vertex indices. The naive approach is very slow and requires comparing each vertex with every other vertex to detect duplicates. Hashing is unreliable due to rounding errors. Using a distance threshold can be made faster using special tree structures, but it still takes a lot of processing. A lot of popular model formats already include indices like gltf, collada, or wavefront obj. Vertices that come from 3D modeling applications are typically already indexed, so users wouldn’t need to manually specify indices when using these programs to author parts.
Normals
Normals are a major issue with LDraw files. Figuring out the best way to deal with line types is best left to the threads already covering those topics. For GPU rendering, normals are usually defined per vertex instead of per face. LDraw files are unusual in that vertices are not indexed, and smooth normals have to be inferred purely from edge information. 3D modeling applications like Blender typically just have users define edges as “hard” or “smooth” and calculate the vertex indexing for you behind the scenes. Normals can be inferred from the vertex positions and indices, so defining custom normals is only necessary for effects like stylized character shading. LDraw wouldn’t need to explicitly include normals as long as the vertices were indexed correctly and parts were not authored with custom normals in mind.
Subparts
Indexing won’t work well with the current subpart usage. Reducing the number of objects is also beneficial in 3D modeling applications since they need to manage a lot of per object state like unique names, bounding boxes, etc. Subparts also add a lot of small files and negatively impact IO performance for file reads as well as compressing, decompressing, or moving the LDraw parts library. Even newer NVME SSDs are still much faster when using fewer, larger files.
I’m not advocating to completely remove subparts. Subfile references in parts files can still be helpful if a part is made up of multiple sections like joints or hinges. There may be other cases where adding another file reference is justified. I’d like to calculate how much bigger the parts library would be with and without zip compression when not using subparts at some point.
Binary Formats
I don’t think binary files are worth the loss in readability. The space savings from my experience come from using reduced precision for attributes like normals. It’s going to be hard to beat the size of a text file defined in integer LDraw units and even harder after compression. The advantage of binary formats is reduced processing when loading, which matters a lot for games. For a good implementation example that’s close to what graphics APIs work with, see gltf. It’s worth noting that GPUs prefer attributes defined per vertex for cache locality, but applications prefer separate lists for attributes like positions and UV maps for flexibility. We can’t optimize for both in the same format, so the benefits of binary files are mixed.
Level of Detail (LOD)
I don’t think this is an issue on modern hardware for an optimized renderer. A bigger concern is object counts since lots of small objects add a lot of overhead and don’t fully utilize the GPU. Indexing vertices greatly reduces the actual number of vertices processed by the GPU due to vertex reuse. Techniques like frustum and occlusion culling provide significant FPS improvements without requiring any LOD changes. The move towards realtime raytracing or virtualized geometry (UE5 Nanite) will make polygon counts even less of an issue. Path tracers like Blender Cycles already scale very well with polygon counts and benefit from instancing. Memory isn’t a concern since LDraw models instance the same parts many times.
I still think level of detail could be desirable in some cases like selecting a different stud type for CAD applications than final renders. For very large scenes, it may be beneficial to use reduced geometry for distant objects. I plan to investigate this at some point for the ldr_wgpu renderer I’ve been working on as well as other optimizations.
File Format
I’m intentionally not suggesting any specific format changes in this initial post. I’d like to incorporate my rendering library into a file viewer application. This would allow displaying LDraw files in a custom text format within the application. People could use the application to open and view existing LDraw files in the current format and see what it would look like in a different text format. I may also put together a tool for converting the entire parts library and outputting to a new folder. I would want to compare things like file sizes, file count, performance, and ease of implementation in different languages with a new format before I propose any specific changes to LDraw itself.