Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Core: Parsing and Writing Tests for V3 Metadata #11947

Open
wants to merge 13 commits into
base: main
Choose a base branch
from

Conversation

HonahX
Copy link
Contributor

@HonahX HonahX commented Jan 10, 2025

Fixes #10764

  • Check schema compatibility when parsing V3 Metadata
  • Add new unit tests for V3 metadata
  • Refactor TableMetadata tests by parametrizing table format versions in all tests
  • Add MetadataTestUtils.TableMetadataBuilder and MetadataTestUtils.BaseSnapshotBuilder to simply test metadata and snapshot build. This will reduce the amount of test code changes when adding new fields to metadata/snapshot.

Not included:

@github-actions github-actions bot added the core label Jan 10, 2025
@HonahX HonahX marked this pull request as ready for review January 10, 2025 19:54
@@ -949,6 +972,8 @@ private Builder(int formatVersion) {
this.schemasById = Maps.newHashMap();
this.specsById = Maps.newHashMap();
this.sortOrdersById = Maps.newHashMap();
this.rowLinageEnabled = false;
this.nextRowId = -1L;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Theoretically, next-row-id can start from any integer, including negative ones. I am thinking may be in the implementation we could make it starts from 0 to simplify the implementation and make it more intuitive.

@HonahX
Copy link
Contributor Author

HonahX commented Jan 10, 2025

cc @RussellSpitzer @flyrain
May I ask for your help to review this?

@@ -240,6 +242,13 @@ public static void toJson(TableMetadata metadata, JsonGenerator generator) throw
}
generator.writeEndArray();

if (metadata.formatVersion() == 3) {
generator.writeBooleanField(ROW_LINEAGE, metadata.rowLinageEnabled());
if (metadata.nextRowId() > -1L) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we check if the row lineage is enable as well as a defensive programming practice? Or we have to carry on the next row id no matter whether row lineage is enabled or not? cc @RussellSpitzer

Copy link
Contributor Author

@HonahX HonahX Jan 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for reviewing this. Based on offline sync with Russell, I removed row-lineage related changes from this PR. Row-lineage related metadata changes will be included in #11948. Sorry for the confusion

return new TableMetadataBuilder(formatVersion);
}

public static class TableMetadataBuilder {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added test-only builders for TableMetadata and BaseSnapshot. This shall reduce the amount of code changes in tests when adding new fields.

cc: @RussellSpitzer

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Parsing and Writing Tests for V3 Metadata
2 participants