Hi Travis,
I'm using a related technique: I have a single array of vertices, but they are sorted by a "lexicographic" comparison function (that is, sort by X; if X is equal, sort by Y; if Y is equal, sort by Z, etc.).
This gives you performance similar to the balanced binary tree: O(N log N) to build the index (due to the sort) and O(log N) lookups via binary search.
The big thing you lose with the array is cheap insertion of "more stuff." But in the case of my code, I know the total vertex count before I even start working (because I have the unsmoothed directive and can count up the triangles and quads and lines) so this is acceptable.
The other thing I discovered is that, given a vertex in the array, it is a bit faster to find the range of equal vertices by linear search (i.e., just walk in both directions in the array) than by a binary search of the whole sorted array. I chalk this up to locality: more cache hits when the search stays close by. I've been meaning to try a 'step' search - that is, start at the vertex, walk in units of N vertices, then walk back by N/2, etc. If we start N at a small number like 16 or so, we can find groups of equal-value vertices in only a few steps. But performance profiles indicate that it's a small optimization at this point; the algorithm is currently dominated by the 'big sort' at the beginning.
I'm currently sorting the _entire_ part together -- lines and tris and quads. When I had the lines separate it was faster (it's quicker to sort two small things than one big thing) but I want to have one 'pile' of vertices for snapping things together, so I think one big pile it is.
I think the big win is going to be recursion optimizations, e.g. smooth the stud, recognize in the smoothing process that the stud is "sealed" from other parts and doesn't need to be resmoothed when it is included, then smooth the parent part without the studs. I base this on the fact that (1) my code can detect that the stud is sealed and (2) small parts optimize almost instantly and (3) the big baseplates minus studs have similar vertex counts to small parts. It's a big refactor to actually get the code working this way though; I'm going to finish T junction repair first.
Cheers
Ben