Enhance unnest support in decoupled mode #17550

Merged
merged 113 commits into from
Jan 17, 2025

kgyrtkirk
Member

  • adds DruidRelFieldTrimmer
  • Unnest now uses unnestFieldType instead of blindly taking the rowType
  • removed the possibly problematic column reuse in UnnestCleanupRule (the trimmer should take care of that)
  • lots of iq file changes, because previously no proper trimming was done below LogicalCorrelate and Unnest nodes
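
To make the trimming idea in the list above concrete, here is a minimal toy sketch in plain Java. It is not the PR's DruidRelFieldTrimmer and does not use Calcite's API; the class name FieldTrimSketch and both methods are hypothetical and exist only for illustration. It assumes a node exposes a list of column names and that its parent reports which column indexes it reads.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: hypothetical names, not Calcite or Druid classes.
class FieldTrimSketch
{
  /** Keeps only the columns whose index appears in used, in their original order. */
  static List<String> trim(List<String> columns, Set<Integer> used)
  {
    List<String> kept = new ArrayList<>();
    for (int oldIndex : new TreeSet<>(used)) {
      kept.add(columns.get(oldIndex));
    }
    return kept;
  }

  /** Maps each surviving old column index to its position after trimming. */
  static Map<Integer, Integer> indexMapping(Set<Integer> used)
  {
    Map<Integer, Integer> mapping = new LinkedHashMap<>();
    for (int oldIndex : new TreeSet<>(used)) {
      mapping.put(oldIndex, mapping.size());
    }
    return mapping;
  }

  public static void main(String[] args)
  {
    // The parent reads only columns 0 and 2, so "dim1" can be dropped below it.
    List<String> columns = List.of("__time", "dim1", "dim3");
    Set<Integer> used = Set.of(0, 2);
    System.out.println(trim(columns, used) + " " + indexMapping(used));
  }
}
```

The index mapping matters because any expression above the trimmed node that referred to old column 2 has to be rewritten to refer to new column 1.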

@kgyrtkirk kgyrtkirk marked this pull request as ready for review December 16, 2024 07:50
import java.util.List;
import java.util.Set;

public class DruidRelFieldTrimmer extends RelFieldTrimmer
Contributor

A javadoc for this class would be really useful

Member Author

added a few lines; the main goal of the class is the same as RelFieldTrimmer's, so I've added a link to it in the apidoc, as everything there also applies here

 .columnTypes(ColumnType.LONG)
 .context(defaultScanQueryContext(
 queryContext,
-RowSignature.builder().add("v0", ColumnType.LONG).build()
+RowSignature.builder().add("__time", ColumnType.LONG).build()
Contributor

I did not understand why the column name changes here. I don't see any other change to the test

Member Author

no columns are needed from the left-hand side, but Calcite has some tweaks here and there to avoid relnodes with 0 columns.

This PR suppresses the field trimmer's introduction of a column with the value 0, but one of the rules also unconditionally projects the 1st column, which causes this change.

We should be able to handle the case of empty columns. I wanted to dig into that more deeply, as it seems there are also some rule combinations that may lead to an empty column set (but I forgot the testcase).

I think that in general Calcite should be able to handle these things, and we should only make corrections in the execution engine if it causes issues.
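
The two behaviours described in this reply can be contrasted in a small toy sketch. It is only an illustration under stated assumptions: the class EmptyProjectionSketch, the Fallback enum, and the placeholder column name "v0" are hypothetical (the name is borrowed from the old test expectation), and nothing here reflects the actual Calcite or Druid rule code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: hypothetical names, not Calcite or Druid classes.
class EmptyProjectionSketch
{
  /** How to avoid a zero-column result when the parent needs no columns at all. */
  enum Fallback
  {
    DUMMY_CONSTANT, // add a synthetic constant column (shown here as "v0")
    FIRST_COLUMN    // unconditionally keep the input's first column
  }

  /** Projects the used columns; falls back as configured if none are used. Assumes columns is non-empty. */
  static List<String> project(List<String> columns, Set<Integer> used, Fallback fallback)
  {
    List<String> kept = new ArrayList<>();
    for (int index : new TreeSet<>(used)) {
      kept.add(columns.get(index));
    }
    if (!kept.isEmpty()) {
      return kept;
    }
    // Nothing is needed from this input, but a relation with zero columns is avoided.
    kept.add(fallback == Fallback.DUMMY_CONSTANT ? "v0" : columns.get(0));
    return kept;
  }

  public static void main(String[] args)
  {
    List<String> columns = List.of("__time", "dim1");
    System.out.println(project(columns, Set.<Integer>of(), Fallback.DUMMY_CONSTANT));
    System.out.println(project(columns, Set.<Integer>of(), Fallback.FIRST_COLUMN));
  }
}
```

Under these assumptions, switching from the first fallback to the second is enough to change an expected signature from a synthetic column to the input's first column, with no other difference in the query.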

Member Author

note: this test was removed recently as it was a defaults mode only case

@@ -92,7 +92,8 @@ enum Modes
 SORT_REMOVE_CONSTANT_KEYS_CONFLICT(DruidException.class, "not enough rules"),
 REQUIRE_TIME_CONDITION(CannotBuildQueryException.class, "requireTimeCondition is enabled"),
 UNNEST_INLINED(Exception.class, "Missing conversion is Uncollect"),
-UNNEST_RESULT_MISMATCH(AssertionError.class, "(Result count mismatch|column content mismatch)");
+UNNEST_RESULT_MISMATCH(AssertionError.class, "(Result count mismatch|column content mismatch)"),
+RESULT_MISMATCH_NATIVE_UNNEST_INCORRECT_RESULTS(Throwable.class, "(Result count mismatch|column content mismatch|ARRAY)");
Contributor

Is this required? Won't it be covered by the previous one?

Member Author

renamed the enum key to have a better name, and updated its matching pattern
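
The reviewer's question about overlap between the two entries can be explored with a small sketch. It assumes that a mode applies when the thrown exception is an instance of the declared class and the regex is found somewhere in its message; that matching logic is an assumption made for illustration, and the class ModesSketch and its matches method are hypothetical rather than the test framework's actual code. The two enum entries are copied from the diff as shown in the review.

```java
import java.util.regex.Pattern;

// Toy model only: the matching logic below is assumed, not taken from Druid.
class ModesSketch
{
  enum Mode
  {
    UNNEST_RESULT_MISMATCH(AssertionError.class, "(Result count mismatch|column content mismatch)"),
    RESULT_MISMATCH_NATIVE_UNNEST_INCORRECT_RESULTS(Throwable.class, "(Result count mismatch|column content mismatch|ARRAY)");

    final Class<? extends Throwable> throwableClass;
    final Pattern pattern;

    Mode(Class<? extends Throwable> throwableClass, String regex)
    {
      this.throwableClass = throwableClass;
      this.pattern = Pattern.compile(regex, Pattern.DOTALL);
    }

    /** True if the throwable has the expected type and its message contains the pattern. */
    boolean matches(Throwable t)
    {
      String message = String.valueOf(t.getMessage());
      return throwableClass.isInstance(t) && pattern.matcher(message).find();
    }
  }

  public static void main(String[] args)
  {
    // Hypothetical failure: neither an AssertionError nor a "mismatch" message.
    Throwable failure = new RuntimeException("cannot handle ARRAY result");
    System.out.println(Mode.UNNEST_RESULT_MISMATCH.matches(failure));
    System.out.println(Mode.RESULT_MISMATCH_NATIVE_UNNEST_INCORRECT_RESULTS.matches(failure));
  }
}
```

Under this assumption the second entry accepts a strict superset of what the first accepts: any Throwable rather than only AssertionError, plus messages that mention ARRAY.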

Add a test to ensure that CalciteJoinQuery can handle join queries with input ref conditions.
…ported

- Modified the @NotYetSupported annotation to @NotYetSupported(Modes.UNNEST_PREDICATE_NOT_SUPPORTED) in CalciteArraysQueryTest.java.
- Updated the @DecoupledTestConfig annotation to @DecoupledTestConfig(ignoreExpectedQueriesReason = IgnoreQueriesReason.PREDICATE_NOT_SUPPORTED) in CalciteArraysQueryTest.java.
- These changes reflect the need to support unnest predicates in the future.
Contributor

cryptoe commented Jan 15, 2025

@adarshsanjeev @kgyrtkirk Let's merge this after the code freeze, so probably tomorrow.

@cryptoe cryptoe merged commit e9782b1 into apache:master Jan 17, 2025
77 checks passed
ashwintumma23 pushed a commit to ashwintumma23/druid that referenced this pull request Jan 20, 2025
Labels
Area - Batch Ingestion, Area - MSQ (For multi stage queries - https://github.com/apache/druid/issues/12262), Area - Querying
3 participants