Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add reverse converter #224

Merged
merged 2 commits into from
Jul 11, 2024
Merged

add reverse converter #224

merged 2 commits into from
Jul 11, 2024

Conversation

bbrehm
Copy link
Contributor

@bbrehm bbrehm commented Jul 11, 2024

This does three things:

  1. Fixes some bugs in the odb->flatgraph conversion. This probably didn't come up in our tests so far, because the bugs were in the containedNode conversion code, and these are only used post improvements in codescience.
  2. Remove the legacyId from the conversion code. We can consider adding that in again?
  3. Add code to convert flatgraph to odb. Main known limitation at this point is the issue that the file format doesn't store the name of edge properties, so this cannot be faithful.

When testing, note that we don't recreate the odb indices on conversion. Odb notices that, and then eagerly deserializes all nodes. If any nodes have latent schema violations, then this will immediately throw on cpg loading, as opposed to the behavior on loading a "real" cpg. This came out because I used a very old ocular version for testing, and CONTAINS edges in IMPLICIT_CALL nodes are new ;)

This still contains some debug code that I'll remove before merging.

@bbrehm bbrehm requested a review from mpollmeier July 11, 2024 10:19
@mpollmeier
Copy link
Contributor

related: ShiftLeftSecurity/overflowdb#439

readNode(legacyIdToNewId, stringInterner, node.id(), bytes, byLabel, storage)
} catch {
case exc: Throwable =>
println(s"fuck ${node.seq()} / ${node.nodeKind} / ${node.label()} / ${node.id()}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i guess this is part of your "debug output" that will be removed / censored before merge? :)

@mpollmeier
Copy link
Contributor

Seems to work!
I created a cpg with joern-flatgraph:

joern> importCode("/home/mp/tmp/cpgtesting/hsl-springboot/HelloShiftLeft")

Then converted with this PR to overflowdb:

odb-convert/target/universal/stage/bin/flatgraph-odb-convert -r /home/mp/Projects/shiftleft/joern/workspace/HelloShiftLeft/cpg.bin /home/mp/Projects/shiftleft/joern/workspace/HelloShiftLeft/cpg.bin.odb

Then loaded into joern master (to verify that it'll fail):

joern> importCpg("/home/mp/Projects/shiftleft/joern/workspace/HelloShiftLeft/cpg.bin.odb")
...
Error loading CPG
java.lang.RuntimeException: java.lang.RuntimeException: Edge REACHING_DEF does not support property `EDGE_PROPERTY`.

Then again on joern master with ShiftLeftSecurity/overflowdb#439 applied:

joern> importCpg("/home/mp/Projects/shiftleft/joern/workspace/HelloShiftLeft/cpg.bin.odb")
...

val res0: Option[io.shiftleft.codepropertygraph.generated.Cpg] = Some(value = Cpg (Graph [45618 nodes]))

joern> cpg.graph.edges.size
val res1: Int = 350033

joern> cpg.graph.edges.count(_.propertiesMap.size == 1)
val res2: Int = 72812

joern> cpg.graph.edges.count(_.propertiesMap.size == 2)
val res3: Int = 0

Copy link
Contributor

@mpollmeier mpollmeier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

after you removed all swear words

@bbrehm
Copy link
Contributor Author

bbrehm commented Jul 11, 2024

Swear words censored, added a debug option to dump the manifest (very useful if something goes wrong).

@bbrehm bbrehm merged commit 852eeb9 into master Jul 11, 2024
1 check passed
@max-leuthaeuser max-leuthaeuser deleted the bbrehm/reverseConvert branch August 5, 2024 10:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants