LDraw.org Discussion Forums

Full Version: Leading or trailing white space characters in file names
If there aren't any arguments against it please call for a vote.

w.
The current spec says: "The whitespace characters allowed for keyword and parameter separation include spaces and tabs". I don't think this is clear or complete at all.

So maybe replace:

Quote:Every line of the file contains one command. With few exceptions, every command is independent of other lines. The exceptions are the BFC meta-commands which modify the behaviour of one or more following command lines. There is no line length restriction. The whitespace characters allowed for keyword and parameter separation include spaces and tabs. Every command starts with a number, called a line type. The function and format of the line is determined by the line type.

with

Quote:Every line of the file contains one command. With few exceptions, every command is independent of other lines. The exceptions are the BFC meta-commands which modify the behaviour of one or more following command lines. There is no line length restriction. Every command starts with a number, called a line type. The function and format of the line is determined by the line type.

And add a section below it (just like "text based", "text encoding", etc.):

Quote:Whitespace

The whitespace used to separate tokens throughout the document may only exist out the following: used to separate tokens may be any of the following:

- space
- horizontal tab
- vertical tab
- form feed

I've removed 'new line' and 'carriage return' from that list because of the line-oriented nature of LDraw.

And as a result of the above we can adjust the line:
Quote:<file> is the filename of the sub-file referenced and must be a valid LDraw filename.

to

Quote:<file> is the filename of the sub-file referenced and must be a valid LDraw filename. Any leading and/or trailing whitespaces must be ignored.

Although a native English speaker could probably word all of this better.
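
For illustration only, here is a minimal sketch of what "leading and/or trailing whitespace must be ignored" could mean for a parser. The function names are hypothetical, and the sketch assumes only space and tab count as white space (the current spec wording); vertical tab and form feed from the list above could be added to the helper if that proposal were adopted. It also assumes the line terminator has already been stripped.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: assumes space and tab are the only white space. */
static int is_ldraw_space(char c)
{
    return c == ' ' || c == '\t';
}

/*
 * Copies `raw` into `out` (capacity `out_size`) without leading or trailing
 * white space. Interior spaces are kept, because file names may contain them.
 * Returns the trimmed length, or (size_t)-1 if `out` is too small.
 */
size_t trim_ldraw_filename(const char *raw, char *out, size_t out_size)
{
    const char *start = raw;
    const char *end;
    size_t len;

    /* Skip leading white space. */
    while (is_ldraw_space(*start))
        start++;

    /* Walk back from the end over trailing white space. */
    end = start + strlen(start);
    while (end > start && is_ldraw_space(end[-1]))
        end--;

    len = (size_t)(end - start);
    if (len + 1 > out_size)
        return (size_t)-1;

    memcpy(out, start, len);
    out[len] = '\0';
    return len;
}
```

Called on the text that follows the last numeric field of a type 1 line, this would turn "  s\sub part.dat " into "s\sub part.dat" while leaving the embedded space alone.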
I just canceled my call for votes in light of this.

In my opinion, we cannot add new white space characters to the LDraw file format at this late date. Space and tab should remain the only valid white space characters.

Also, I request that we split whitespace up into two words. The original whitespace text had it as one word, and I matched that in my original wording. However, it should be two words.

Finally, the following text is messed up:

Quote:The whitespace used to separate tokens throughout the document may only exist out the following: used to separate tokens may be any of the following:

I recommend the following wording for the new White Space section:

Quote:The white space characters used to separate tokens throughout the LDraw file may be either space or tab. Both should be treated the same, and any number of contiguous white space characters (1 or more) are allowed.
Is it relevant to mention here that tokens on line types 1-4 MUST be separated by whitespace? I believe LDView (but maybe only older versions) recognises a leading minus sign as a delimiter (e.g.
Code:
1 16  0 0 0  1 0 0  0 1 0  0 0-1 s\subpart.dat
would be valid syntax).
Chris Dee Wrote:Is it relevant to mention here that tokens on line types 1-4 MUST be separated by whitespace? I believe LDView (but maybe only older versions) recognises a leading minus sign as a delimiter (e.g.
Code:
1 16  0 0 0  1 0 0  0 1 0  0 0-1 s\subpart.dat
would be valid syntax).

I do not consider that valid syntax, whether recognized by any program or not. So yes, fields MUST be delimited by whitespace.

I also agree with Travis: space and tab have been the only documented delimiters for many years, and it is inadvisable to add more.

Allen
I'd say the current spec implies that white space is required between each parameter, but doesn't outright state such. I agree that this should be added. How about the following as the content of the proposed new White Space section:

Quote:Command parameters for every line type must be separated by white space. The white space characters used to separate these parameters may be either space or tab. Both should be treated the same, and any number of contiguous white space characters (1 or more) are allowed.
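
As a rough sketch of how the proposed wording might translate into code, the following splits a line on runs of one or more spaces or tabs, treating both the same. The function name is hypothetical. It deliberately ignores the file name exception on type 1 lines (a file name containing spaces would be split), skips any leading white space without taking a position on whether that is valid, and assumes the line terminator has already been removed.

```c
#include <stddef.h>
#include <string.h>

/*
 * Splits `line` in place on runs of spaces and tabs and stores up to
 * `max_tokens` token pointers in `tokens`. Returns the number of tokens stored.
 * Hypothetical helper: not taken from any existing LDraw program.
 */
size_t split_ldraw_tokens(char *line, char **tokens, size_t max_tokens)
{
    size_t count = 0;
    char *p = line;

    while (count < max_tokens) {
        p += strspn(p, " \t");      /* skip any run of separators */
        if (*p == '\0')
            break;
        tokens[count++] = p;        /* a token starts here */
        p += strcspn(p, " \t");     /* advance to the end of the token */
        if (*p == '\0')
            break;
        *p++ = '\0';                /* terminate the token in place */
    }
    return count;
}
```

With this approach Chris's sample line yields 14 tokens rather than 15, and the thirteenth is the single token "0-1", which a later numeric check can then reject.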

We also should decide whether it's valid to have white space before the line type number at the beginning of the line. I think this is also ambiguous.

Just as a note, while the described LDView behavior doesn't surprise me now that it has been pointed out, I don't think anyone has ever mentioned it to me before. And while the current spec may be somewhat ambiguous, I suspect that if you asked any part authors if your sample line was valid, they would say no. I'll update LDView to generate an error (assuming we agree to require white space between command parameters).
As a minor point, I believe that LDView will in fact reject the sample line, but only because it has special handling of type 1 lines to determine the exact location of the filename. So if anyone tests the above and sees that LDView rejects it, that shouldn't be taken as an indication that LDView isn't broken. A similar missing white space in any other line type (2-5) will probably be ignored by LDView.

For the developers out there, the following sscanf() should succeed on Chris's sample line:

Code:
if (sscanf(line, "%d %d %f %f %f %f %f %f %f %f %f %f %f %f %s", &lineType, &colorNumber, &x, &y, &z, &a, &b, &c, &d, &e, &f, &g, &h, &i, filename) == 15)
{
    // Success!
}

(Note that it has other problems, since %s stops at the first white space character, so it wouldn't work with filenames containing spaces.)
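
To make the reason concrete: a space in a scanf format matches zero or more white space characters, so "%f %f" reads "0-1" as the two numbers 0 and -1. The sketch below isolates that behaviour and contrasts it with one possible strict check based on strtod(). Both function names are hypothetical, and the strict check is only an assumption about how a parser could enforce the rule; strtod() also accepts forms such as hexadecimal and "inf", so a real validator may want tighter rules.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Returns how many floats sscanf() converts from `text` when asked for two.
 * The space in the format matches ZERO or more white space characters,
 * which is why "0-1" is read as two numbers.
 */
int count_lenient_floats(const char *text)
{
    float a, b;
    int n = sscanf(text, "%f %f", &a, &b);

    return n < 0 ? 0 : n;   /* treat EOF (no input) as zero conversions */
}

/*
 * Returns 1 only if `token` is exactly one number with nothing after it.
 * Intended for tokens that have already been split on spaces and tabs.
 */
int is_strict_number(const char *token)
{
    char *end;

    if (*token == '\0')
        return 0;
    (void)strtod(token, &end);
    return end != token && *end == '\0';
}
```

Here count_lenient_floats("0-1") reports two conversions, while is_strict_number("0-1") reports failure because strtod() stops at the minus sign and leaves "-1" unread.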
Travis Cobbs Wrote:We also should decide whether it's valid to have white space before the line type number at the beginning of the line. I think this is also ambiguous.

I think if we go with the "tokens separated by white space" text above, leading white space on a line is automatically valid. The only exception is the file name, which at its start 'disables' the normal white space / token behavior.
Roland Melkert Wrote:
Travis Cobbs Wrote:We also should decide whether it's valid to have white space before the line type number at the beginning of the line. I think this is also ambiguous.

I think if we go with the "tokens separated by white space" text above, leading white space on a line is automatically valid. The only exception is the file name, which at its start 'disables' the normal white space / token behavior.

I don't consider leading whitespace on a line to be valid. I think there's a difference between "separates" and "starts with." My parser will accept it some places by accident, but considers it a syntax error in many other places.

Have files containing lines with leading whitespace been observed in the wild?

Allen
Leading whitespace on line types 1 to 5 is very common in the official library, as it was introduced by the inlining function of at least one older authoring tool. In the past, I believe the script that makes "~Moved to" files also put a leading space before the Type 1 line.
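
Since both readings have come up in this thread, here is a small sketch of a line type reader with the leading white space behaviour as a switch. The function name and the `allow_leading_ws` flag are hypothetical. It assumes space and tab are the only white space characters, that line types run from 0 to 5, and that the line terminator has already been stripped.

```c
/*
 * Returns the line type (0-5) of `line`, or -1 if none is found.
 * If `allow_leading_ws` is nonzero, leading spaces and tabs are skipped first;
 * otherwise the line type must be the very first character.
 */
int ldraw_line_type(const char *line, int allow_leading_ws)
{
    const char *p = line;

    if (allow_leading_ws) {
        while (*p == ' ' || *p == '\t')
            p++;
    }

    /* The line type is a single digit in the range 0-5. */
    if (*p < '0' || *p > '5')
        return -1;

    /* It must be followed by white space or the end of the line. */
    if (p[1] != '\0' && p[1] != ' ' && p[1] != '\t')
        return -1;

    return *p - '0';
}
```

A tolerant reader would pass 1 for the flag and accept the library files described above; a validating tool could pass 0 and report those same lines instead.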