LDraw.org Discussion Forums
invalid utf8 in part 2902 (2902s1.dat)



invalid utf8 in part 2902 (2902s1.dat) - Jonathan N - 2023-02-21

Hello! I'm new to these forums, so sorry if this is the wrong place. I'm working on code to load LDraw files and noticed that not all parts seem to use valid UTF-8. The file "LDraw\parts\s\2902s01.dat" contains comment lines that don't decode properly as UTF-8 in programming languages like Python or Rust due to a missing continuation byte. The comment renders in some editors as "0 Cran creus�". I would expect the e with an accent to be the two bytes C3 A9 in hex, but the file uses only the single byte E9. The entire phrase appears in the file as 3020 4372616E 20637265 7573E9. LDView displays the text just fine as "0 Cran creusé". Is this part valid? Should programs match the behavior of LDView?
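
To make the byte-level problem concrete, here is a minimal Python sketch that rebuilds the 13 bytes quoted above and tries to decode them. The helper name `try_utf8` is made up for illustration, and reading the lone E9 as Latin-1 is an assumption about how the file was originally saved, not something the file itself declares.

```python
# The 13 bytes quoted above from parts/s/2902s01.dat.
RAW = bytes.fromhex("3020 4372616E 20637265 7573E9")


def try_utf8(data: bytes):
    """Return (text, None) on success or (None, error) on failure."""
    try:
        return data.decode("utf-8"), None
    except UnicodeDecodeError as err:
        return None, err


text, err = try_utf8(RAW)
# Strict UTF-8 decoding fails at the final byte: 0xE9 opens a
# three-byte sequence, but no continuation bytes follow it.
if err is not None:
    print(f"invalid UTF-8 at byte offset {err.start}: {err.reason}")

# Assumption: the file was saved in a single-byte encoding such as
# Latin-1, where 0xE9 on its own means "é".
print(RAW.decode("latin-1"))

# A valid UTF-8 file would store "é" as the two bytes C3 A9.
print("é".encode("utf-8").hex())
```

The same bytes decode cleanly under a single-byte encoding, which is one plausible reason a viewer could show the accent while a strict UTF-8 decoder rejects the file.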


RE: invalid utf8 in part 2902 (2902s1.dat) - Orion Pobursky - 2023-02-21

I'm aware of the UTF issues on some (3 by my count) parts. These will be fixed when the entire library is reissued next update (tentatively scheduled for this coming weekend, 2/24ish).


RE: invalid utf8 in part 2902 (2902s1.dat) - Jonathan N - 2023-02-22

(2023-02-21, 23:50)Orion Pobursky Wrote: I'm aware of the UTF issues on some (3 by my count) parts. These will be fixed when the entire library is reissued next update (tentatively scheduled for this coming weekend, 2/24ish).

Thanks for the quick response. Decoding all dat files in my ldraw parts library that I downloaded today gives me errors for "parts/s/2902s01.dat" and "parts/s/87606s01.dat", so the update should cover it.
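
A check like the one described here could be sketched roughly as follows. This is only an illustration, not the script actually used: the function name `find_invalid_utf8` and the library path passed to it are hypothetical.

```python
from pathlib import Path


def find_invalid_utf8(root):
    """Return (path, byte_offset) for each .dat file under root
    whose contents are not valid UTF-8."""
    bad = []
    # Note: the "*.dat" pattern is case-sensitive on most Linux
    # file systems, so files named "*.DAT" would be skipped there.
    for path in sorted(Path(root).rglob("*.dat")):
        try:
            path.read_bytes().decode("utf-8")
        except UnicodeDecodeError as err:
            # err.start is the offset of the first offending byte.
            bad.append((path, err.start))
    return bad
```

Calling `find_invalid_utf8("ldraw")` (with "ldraw" standing in for wherever the library was unpacked) returns the offending files together with the offset of the first bad byte in each.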


RE: invalid utf8 in part 2902 (2902s1.dat) - Orion Pobursky - 2023-02-22

(2023-02-22, 0:03)Jonathan N Wrote: Thanks for the quick response. Decoding all dat files in my ldraw parts library that I downloaded today gives me errors for "parts/s/2902s01.dat" and "parts/s/87606s01.dat", so the update should cover it.

A quick check would be to download the official parts from the current library:
https://library.ldraw.org/official/parts/s/2902s01.dat

The direct download should have the issue fixed. If it's not, let me know.


RE: invalid utf8 in part 2902 (2902s1.dat) - Travis Cobbs - 2023-02-22

(2023-02-21, 23:44)Jonathan N Wrote: LDView displays the text just fine as "0 Cran creusé".

Regarding LDView: it looks like there is a bug in LDView's conversion of LDraw file text to Unicode for display in the model tree. I'll fix that.
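
For loaders that want to tolerate older files like this one, one possible approach is to try strict UTF-8 first and fall back to a single-byte encoding only when that fails. The sketch below is not how LDView works (this thread does not describe its conversion code); the function name `decode_ldraw_bytes` and the choice of Latin-1 as the fallback are assumptions made for illustration.

```python
def decode_ldraw_bytes(data: bytes, fallback: str = "latin-1"):
    """Decode file contents as strict UTF-8, falling back to a
    single-byte encoding if the bytes are not valid UTF-8.

    Returns (text, encoding_used) so callers can log or flag
    files that needed the fallback.
    """
    try:
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Assumption: legacy files were saved as Latin-1. Latin-1
        # maps every byte to a character, so this never raises.
        return data.decode(fallback), fallback
```

Returning the encoding that was actually used keeps the fallback visible, so a program can still report which parts are not valid UTF-8 instead of silently accepting them.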