Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Schema.getNodePropertyDefault: also lookup in 'getPropertyOptionCompat' #152

Closed
wants to merge 2 commits into from

Conversation

mpollmeier
Copy link
Contributor

@mpollmeier mpollmeier commented Mar 6, 2024

Domain classes: currently generating as a list of if/else statements - given that it's two parameters the jvm will probably not compile it into a tableswitch. What do you think,
should we rather change it into a Map or a 2d array instead?

getPropertyOptionCompat needs to take the default value into account, so that the behaviour matches overflowdb.

generating as a list of if/else statements - given that it's two
parameters the jvm will probably not compile it into a tableswitch.
what do you think, should we rather change it into a Map or a
2d array instead?
@mpollmeier mpollmeier requested a review from bbrehm March 6, 2024 08:24
@mpollmeier mpollmeier changed the title Schema.getNodePropertyDefault (also generate for domain classes) Schema.getNodePropertyDefault: also lookup in 'getPropertyOptionCompat' Mar 6, 2024
@bbrehm
Copy link
Contributor

bbrehm commented Mar 6, 2024

Do we need this?

Currently nodePropertyDefaults are implemented by the constructor of NewCall et al, and the schema doesn't need to know about them.

The only reason odbv1 getPropertyOption needs to know about the defaults is a specific storage optimization to odbv1: If a property is equal to the default, then it is not serialized. Since the slot is still taken in the NodeDb class, it saves no heap memory, but it does save disk memory / network traffic.

Since we use columnar layout, compression and stringpools in flatgraph, an array of "almost all values are equal to the default" compresses down to roughly nothing, so that optimization is not required.

This does have a peculiar effect on schema changes, though:

Suppose we create a graph in odbv1, where we set some field of some node to "default_v1", which happens to be the default value. Then we serialize the graph.

Half a year later, we changed the schema: The default is now "default_v2". We now read the old file with a new version of our software, and query the property-value of the node.

In flatgraph, we will receive "default_v1"; in odbv1 we received "default_v2". Do we care a lot about this setting?

If we were to add this to the schema, I think the formalquantity/type would be a good place to store this (i.e. have case classes instead of objects).

@mpollmeier
Copy link
Contributor Author

Some more context: this got triggered by a red test while porting codescience: https://github.com/ShiftLeftSecurity/codescience/blob/master/cpgvalidator/src/test/scala/io.shiftleft.cpgvalidator/CpgValidatorTest.scala#L25

Without this, we get 8 additional 'cpg validation errors' because the mandatory property is not defined. Now, given that these days all mandatory properties are guaranteed to have a default value, I suppose the check itself is rather useless and can simply be dropped.

That being said we might still need this (or at least the change to getPropertyOptionCompat) because overflowdb did behave differently (i.e. lookup the default value), and that's the entire reason we have that 'compat' method, no?

@bbrehm
Copy link
Contributor

bbrehm commented Mar 6, 2024

Merde, this is a very good point:

You're probably generating a CPG with joern-schema, put it to disk, and then load it with codescience-schema?

Then we would trigger this issue, for mandatory properties with default that exist in codescience and that don't exist in joern.

OK, I'll think about it, you have convinced me that something must be done.

@mpollmeier mpollmeier closed this Dec 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants