Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SNOW-1882616] Error out for duplicate keys in variant #929

Merged
merged 2 commits into from
Jan 16, 2025

Conversation

sfc-gh-ggeng
Copy link
Contributor

The SDK should not ingest the variant like {"key": 1, "key": 2} otherwise, we would have issues in the XP during select and there is no good mitigation on that.

In this change, we would error out with INVALID_VALUE_ROW for such duplicate cases.

Added Unit test for the change.

@sfc-gh-ggeng sfc-gh-ggeng requested review from sfc-gh-tzhang and a team as code owners January 14, 2025 23:47
@@ -856,6 +856,29 @@ public void testValidateAndParseObject() throws Exception {
() -> validateAndParseObjectNew("COL", Collections.singletonMap("foo", new Object()), 0));
}

@Test
public void testValidateDuplicateKeys() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add a test that tests validateAndParseVariantNew, as well as validateAndParseArrayNew (array of objects with duplicated keys)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for bringing this up. Added!

@@ -176,6 +177,10 @@ private static String validateAndParseSemiStructured(
throw valueFormatNotAllowedException(
columnName, snowflakeType, "Not a valid JSON", insertRowIndex);
} catch (IOException e) {
if (e.getMessage().contains("Duplicate field")) {
throw valueFormatNotAllowedException(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's 'chain' exceptions like you do below: pass the original e exception to the constructor of the one you throw. This will preserve information contained in the original exception.

I wonder if the "Not a valid JSON: duplicate field" message you're adding to the child exception becomes redundant in this case: the original one already has similar information. Keep or edit or delete, up to you.

Copy link
Contributor

@sfc-gh-lsembera sfc-gh-lsembera Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sfc-gh-dorlovsky We don't want to preserve information here, it may lead to customer data being exposed in client side logs.

if (e.getMessage().contains("Duplicate field")) {
throw valueFormatNotAllowedException(
columnName, snowflakeType, "Not a valid JSON: duplicate field", insertRowIndex);
}
throw new SFException(e, ErrorCode.IO_ERROR, "Cannot create JSON Parser or JSON generator");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: would it be useful to add columnName, snowflakeType, and insertRowIndex to this exception too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing this out! Added the info

Copy link
Contributor

@sfc-gh-tzhang sfc-gh-tzhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, thanks!

@sfc-gh-ggeng sfc-gh-ggeng merged commit 6c3190c into master Jan 16, 2025
48 checks passed
@sfc-gh-ggeng sfc-gh-ggeng deleted the ggeng_SNOW-1882616-duplicate-keys branch January 16, 2025 00:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants