
Auto-Compaction using Multi-Stage Query Engine #16291

Merged
merged 64 commits into from
Jul 12, 2024

Conversation

@gargvishesh (Contributor) commented Apr 16, 2024

Description

Compaction operations issued by the Coordinator currently run using the native query engine. As the majority of the advancements we are making in batch ingestion are in MSQ, it is imperative that we support compaction on MSQ to make compaction more robust and possibly faster. For instance, we have seen OOM errors in native compaction that MSQ could have handled via its auto-calculation of tuning parameters.

This PR enables compaction on MSQ to remove the dependency on native engine.

Main changes:

  • DataSourceCompactionConfig now has an additional field engine that can be one of [native, msq], with native being the default.
  • If engine is MSQ, the CompactSegments duty assigns all available compaction task slots to the launched CompactionTask to ensure full capacity is available to MSQ. This avoids the stalling that could happen if only a fraction of the slots were allotted and they eventually fell short of the number of tasks required by the MSQ engine to run the compaction.
  • ClientCompactionTaskQuery has a new field ClientCompactionRunnerInfo with just one Engine subfield.
  • CompactionTask now has a CompactionRunner interface instance, with its implementations NativeCompactionRunner in core and MSQCompactionRunner in the druid-multi-stage-query extension. The ObjectMapper deserializes the ClientCompactionRunnerInfo in ClientCompactionTaskQuery to the CompactionRunner instance mapped to the specified type [native, msq].
  • CompactionTask uses the CompactionRunner instance it receives to create the indexing tasks.
  • The CompactionTask to MSQControllerTask conversion logic checks whether metrics are present in the segment schema. If present, the task is created with a native group-by query; if not, the task is issued with a scan query. The storeCompactionState flag is set in the context.
  • Each created MSQControllerTask is launched in-place and its TaskStatus tracked to determine the final status of the CompactionTask. The id of each of these tasks is the same as that of the CompactionTask since otherwise the workers would be unable to determine the controller task's location for communication (as they haven't been launched via the Overlord).
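The engine-to-runner mapping described above can be sketched roughly as follows. This is an illustrative Python sketch of the dispatch idea, not the actual Druid Java code; the class and method names mirror the PR description but the bodies are placeholders.

```python
# Sketch of dispatching a compaction engine type to its runner,
# mirroring the native/msq split described in the PR. The run_compaction
# bodies are stand-ins, not real compaction logic.

class CompactionRunner:
    def run_compaction(self, datasource: str) -> str:
        raise NotImplementedError

class NativeCompactionRunner(CompactionRunner):
    def run_compaction(self, datasource: str) -> str:
        return f"native compaction of {datasource}"

class MSQCompactionRunner(CompactionRunner):
    def run_compaction(self, datasource: str) -> str:
        return f"msq compaction of {datasource}"

RUNNERS = {"native": NativeCompactionRunner, "msq": MSQCompactionRunner}

def runner_for(engine: str = "native") -> CompactionRunner:
    # native is the default, as in DataSourceCompactionConfig.
    if engine not in RUNNERS:
        raise ValueError(f"Unknown engine: {engine}")
    return RUNNERS[engine]()
```

In the real PR this mapping happens via Jackson deserialization of ClientCompactionRunnerInfo; the dictionary lookup here is only a stand-in for that type registry.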

Some things to note:

  • The context specified in DataSourceCompactionConfig is passed as-is to the MSQControllerTask and hence can contain MSQ context params as well, with the exception of rowsPerSegment -- which will be overridden by either targetRowsPerSegment or maxRowsPerSegment if specified in a partitionsSpec.
  • The maxRowsInMemory param is only considered if specified in the context. The value in DataSourceCompactionConfig.tuningConfig is not considered because it is set to a default value (1M) if unspecified by the user, making it impossible to distinguish a user-provided value from the default.
  • If no maxNumTasks value is specified in the taskContext, min(availableCompactionTaskSlots, 5) is allotted to MSQ compaction tasks.
  • rollup:true without any metricsSpec de-duplicates rows since all columns are then treated as dimensions -- just as in native compaction.
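The maxNumTasks defaulting described above is simple enough to sketch. This is a minimal illustration under the assumptions stated in the PR text; the function name and dict-based context are mine, not Druid's API.

```python
# Sketch of the task-slot defaulting described above: if the task context
# does not specify maxNumTasks, the MSQ compaction task is allotted
# min(availableCompactionTaskSlots, 5) slots.

def resolve_max_num_tasks(task_context: dict, available_compaction_task_slots: int) -> int:
    if "maxNumTasks" in task_context:
        return task_context["maxNumTasks"]
    return min(available_compaction_task_slots, 5)
```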

Currently unsupported for MSQ Compaction:

  • Updating the cluster-level default compaction engine. The current default is native, which can be updated at a per-datasource level. The cluster-level update API will come in a follow-up PR.
  • Grouping on multi-value columns. The flag groupByEnableMultiValueUnnesting is disabled. Only array-type columns are supported.
  • partitionsSpec of type HashedPartitionsSpec. Only DimensionRangePartitionsSpec and DynamicPartitionsSpec work.
  • maxTotalRows in DynamicPartitionsSpec. Only maxRowsPerSegment works.
  • rollup set to false in granularitySpec when metricsSpec is specified. Only rollup set to true works with a non-empty metricsSpec.
  • Aggregators in metricsSpec where fieldName != name OR (internally) aggregatorFactory.getClass() != aggregatorFactory.getCombiningFactory().getClass(). This is a major limitation that will be addressed in a follow-up PR.
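The last limitation above amounts to a per-aggregator check. A hedged sketch of that check follows; the Aggregator dataclass here is a stand-in for Druid's AggregatorFactory, with class names represented as strings for illustration only.

```python
# Sketch of the unsupported-aggregator condition described above: an
# aggregator is rejected for MSQ compaction if its input column differs
# from its output column, or if its combining factory is a different
# class from the factory itself.

from dataclasses import dataclass

@dataclass
class Aggregator:
    name: str                     # output column
    field_name: str               # input column
    factory_class: str            # stand-in for aggregatorFactory.getClass()
    combining_factory_class: str  # stand-in for getCombiningFactory().getClass()

def supported_by_msq_compaction(agg: Aggregator) -> bool:
    return (agg.field_name == agg.name
            and agg.factory_class == agg.combining_factory_class)
```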

Release note


Key changed/added classes in this PR
  • CompactionRunner
  • NativeCompactionRunner
  • MSQCompactionRunner
  • ClientCompactionRunnerInfo

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

@github-actions github-actions bot added Area - Batch Ingestion Area - Ingestion Area - MSQ For multi stage queries - https://github.com/apache/druid/issues/12262 labels Apr 16, 2024
@kfaraz kfaraz self-requested a review April 16, 2024 06:11
@kfaraz (Contributor) commented Jul 2, 2024

> I am wondering if there are unit tests that test the new MSQ compaction flow. I could find only the spec validation tests.

> I think conversion to the MSQ spec is the main job of the MSQCompactionRunner. Unit testing the entire flow doesn't really ensure that nothing is broken in this class. I'm planning to cover the entire flow in ITs instead.

@gargvishesh, I concur with @LakshSingla on this one. We need UTs to test the entire flow with MSQ the same way we do with native. We may add ITs too, but most Druid devs rely more heavily on UTs, as the IT flow is currently a little flaky.

  • If you have already added tests for the new flow, please share that detail here.
  • If you are yet to start on those, you could just try to parameterize the engine in the existing compaction flow tests and run the same tests with both native and msq engines. Please let me know if you need any assistance with this.

@@ -305,17 +305,30 @@ private CompactionStatus metricsSpecIsUpToDate()
if (ArrayUtils.isEmpty(configuredMetricsSpec)) {
return COMPLETE;
}
final AggregatorFactory[] configuredMetricsCombiningFactorySpec =
@kfaraz (Contributor) commented Jul 5, 2024

Why are we comparing just the combining factory now and not the original AggregatorFactory itself? Please add comments (and tests, if not already added) to clarify this change.

@gargvishesh (Contributor, Author) replied:

Added it as a Javadoc for the function.

@kfaraz (Contributor) commented Jul 10, 2024

I would advise retaining the old version of the code and making this change in a follow-up. In the current PR, I would prefer that we not modify any of the logic for native compaction. What happens if we just stick to the old logic?

@gargvishesh (Contributor, Author) replied:

With the last change dropping support for problematic aggregator-factory definitions, this change is no longer required. I have reverted it.

@gargvishesh (Contributor, Author) commented:

@kfaraz @LakshSingla
Thanks for your comments -- I do see the usefulness of UTs for the entire MSQ-based compaction flow. However, the existing tests in CompactionTaskRunTest cannot be parameterized, since the MSQ code resides in an extension. The tests therefore need to live in the extension itself and will require some implementation effort. I will take them up in a follow-up PR.

@kfaraz (Contributor) left a comment

Please resolve the merge conflicts in the PR.

They were introduced by a recent NPE bugfix in #16713. I think this bugfix is also included in the current PR, but I decided to merge it separately as the issue had been reported in the community in a couple of places.

Sorry for the inconvenience!

# Conflicts:
#	server/src/main/java/org/apache/druid/server/coordinator/duty/CompactSegments.java
@gargvishesh gargvishesh requested a review from kfaraz July 11, 2024 06:35
@kfaraz (Contributor) left a comment

Thanks a lot for taking this PR to completion, @gargvishesh !
You have been a great sport with addressing the multiple rounds of feedback. 🚀

@kfaraz kfaraz merged commit 197c54f into apache:master Jul 12, 2024
88 checks passed
@gargvishesh (Contributor, Author) commented Jul 12, 2024

Thank you @kfaraz, @LakshSingla and @cryptoe for the time and effort to do all those rounds of reviews, esp. given the size of the PR.

kfaraz pushed a commit that referenced this pull request Jul 30, 2024
Follow-up to #16291, this commit enables a subset of existing native compaction ITs on the MSQ engine.

In the process, the following changes have been introduced in the MSQ compaction flow:
- Populate `metricsSpec` in `CompactionState` from `querySpec` in `MSQControllerTask` instead of `dataSchema`
- Add check for pre-rolled-up segments having `AggregatorFactory` with different input and output column names
- Fix passing missing cluster-by clause in scan queries
- Add annotation of `CompactionState` to tombstone segments
sreemanamala pushed a commit to sreemanamala/druid that referenced this pull request Aug 6, 2024
sreemanamala pushed a commit to sreemanamala/druid that referenced this pull request Aug 6, 2024
@kfaraz kfaraz added this to the 31.0.0 milestone Oct 4, 2024