Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid assuming binary representation of C structs #116

Open
plimkilde opened this issue Jul 21, 2021 · 4 comments
Open

Avoid assuming binary representation of C structs #116

plimkilde opened this issue Jul 21, 2021 · 4 comments

Comments

@plimkilde
Copy link

The spec uses C struct declarations in a few places to describe certain data structures in LAS files. This may be problematic since it could seem implied that the structs as given are directly serializable/deserializable. The binary representation of such C structs is ABI-dependent, with variable endianness and padding bytes.

While the use of C structs would be fine for documenting a LAS-handling C library, that is not what the LAS specification describes. I think it would be more appropriate to replace them with simple tables of fields instead.

The data structures currently described using C structs are:

  • sGeoKeys and sKeyEntry under the GeoKeyDirectoryTag Record
  • CLASSIFICATION under the Classification Lookup VLR
  • EXTRA_BYTES under the Extra Bytes VLR
@hobu
Copy link
Contributor

hobu commented Sep 28, 2021

I would continue to argue for C-style struct definitions. The rationale for this is the language and meaning of them is very precise, and it is clear to software implementors what the definitions intent and interpretation is. The GeoKeyDirectoryTag is taken directly from the GeoTIFF specification, for example. It makes little sense to change it.

It is my understanding that the original pre-1.0 LAS specification, the file format was intended to be slurped up as mmap'd data structures in C. An old timer could correct us on that. It doesn't seem worth the trouble to change this.

@esilvia
Copy link
Member

esilvia commented Oct 14, 2021

I'm inclined to agree with @hobu on this, but I confess I'm biased as I'm primarily a C++ developer myself and I found the C-style structs helpful when writing my own code.

That said, if you'd like to draft/propose an alternative method of documentation, I'd be happy to entertain it.

@plimkilde
Copy link
Author

One approach that retains the C struct notation is used in the current COPC spec draft (kudos @hobu), which adds this caveat:

Some of the file format is described using C-language fixed width integer types. Groups of entities are denoted with a C-language struct, though all data is packed on byte boundaries and encoded as little-endian values, which may not be the case for a C program that uses the same notation.

My issue with the C struct notation is not that it is C-centric, but rather that C or C++ programmers reading it should not assume that they can generally cast or mmap a bag of LAS bytes into a valid struct. I do get that it's a convenient way of communicating the data model, however.

@esilvia
Copy link
Member

esilvia commented Oct 15, 2021

Thanks @plimkilde that does help.

If you'd like to propose that clarification to the data types section, you can find it here: https://github.com/ASPRSorg/LAS/blob/draft-1.4-R16/source/02.03_datatypes.sub

Alternatively, if you think it makes more sense to include it with the C-style structs, those can be found here:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants