Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat!: Bump datafusion, prost, hyper, tonic, tower, axum #5417

Open
wants to merge 124 commits into
base: main
Choose a base branch
from

Conversation

evenyag
Copy link
Contributor

@evenyag evenyag commented Jan 21, 2025

I hereby agree to the terms of the GreptimeDB CLA.

Refer to a related PR or issue link (optional)

What's changed and what's your intention?

This PR bumps a lot of crates:

  • datafusion 2464703c84c400a09cc59277018813f0e797bb4e (After 43.0.0)
  • sqllparser 0.52.x
  • prost 0.13
  • hyper 1.x
  • tonic 0.12
  • tower 0.5
  • axum 0.8
  • arrow 53
  • parquet 53

There are too many changes due to many breaking changes in these crates. Here are some of them:

  • Implements Debug for plans and related structs.
  • Uses SessionStateBuilder to build SessionState
  • Removes the usage of HandleErrorLayer.
  • Implements From<arrow interval> for interval types.
  • Adds a new method new_with_count() to RecordBatch to allow constructing an empty RecordBatch that only contains row count.
  • DFLogicalSubstraitConvertor doesn't require catalog list now.
  • Supports Some(Vec::new()) projection in mito.
  • mito uses min_bytes_opt, max_bytes_opt, min_opt, max_opt.
  • ColumnSchema will ignore the metadata of time index if it doesn't have timestamp type. (For SQL CAST).
  • Some protobuf structs support Copy now.
  • DfContextProviderAdapter implements all missing methods.
  • Uses SortExpr and the new function API of datafusion.
  • Uses Bytes to replace RawBody in axum handlers.
  • Refactors all middleware for gRPC server and HTTP server.
  • Updates all sqlness results.
  • udf functions with zero arguements need Nullary signature now

Breaking changes

  • id and tag are SQL keywords now

Notes

  • The HTTP server doesn't support setting the TCP keepalive timeout. We may need to check if it has regressions.
  • DatafusionQueryEngine::optimize_physical_plan() doesn't optimize the physical plan now, as the planner already optimize it.

PR Checklist

Please convert it to a draft if some of the following conditions are not met.

  • I have written the necessary rustdoc comments.
  • I have added the necessary unit tests and integration tests.
  • This PR requires documentation updates.
  • API changes are backward compatible.
  • Schema or data changes are backward compatible.

waynexia and others added 30 commits December 6, 2024 16:33
Signed-off-by: Ruihang Xia <[email protected]>
Signed-off-by: Ruihang Xia <[email protected]>
Signed-off-by: Ruihang Xia <[email protected]>
Signed-off-by: Ruihang Xia <[email protected]>
* remove handle_error handler
* refactor timeout layer
* serve axum
* boxed()
* use TokioIo
* feat: exclude script crate

* chore: simplify feature

* feat: remove the script crate

* chore: remove python feature and some comments

* chore: fix warning
src/common/meta/src/state_store.rs Show resolved Hide resolved
src/mito2/src/read/projection.rs Show resolved Hide resolved
Comment on lines +153 to +163
let runtime = if cfg!(debug_assertions) {
// Set the stack size to 8MB for the thread so it wouldn't overflow on large stack usage in debug mode
// This is necessary to avoid stack overflow while running sqlness.
// https://github.com/rust-lang/rust/issues/34283
builder
.thread_stack_size(8 * 1024 * 1024)
.build()
.context(BuildRuntimeSnafu)?
} else {
builder.build().context(BuildRuntimeSnafu)?
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wondering who is overflowing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can check this backtrace. Maybe the LogicalPlan consumes the stack a lot and the rewrite() call finally overflow.
sqlness-overflow.txt

src/mito2/src/cache/file_cache.rs Show resolved Hide resolved
src/sql/src/util.rs Outdated Show resolved Hide resolved
tests-integration/src/otlp.rs Outdated Show resolved Hide resolved
tests-integration/src/otlp.rs Outdated Show resolved Hide resolved
src/common/meta/src/state_store.rs Show resolved Hide resolved
src/flow/src/expr/relation/func.rs Show resolved Hide resolved
src/flow/src/df_optimizer.rs Show resolved Hide resolved
src/flow/src/transform/aggr.rs Show resolved Hide resolved
src/flow/src/transform/literal.rs Show resolved Hide resolved
src/flow/src/transform/literal.rs Outdated Show resolved Hide resolved
src/flow/src/transform/literal.rs Outdated Show resolved Hide resolved
src/operator/src/statement/admin.rs Outdated Show resolved Hide resolved
src/query/src/datafusion.rs Show resolved Hide resolved
src/flow/src/expr/func.rs Show resolved Hide resolved
@@ -179,6 +187,7 @@ mod test {

pub fn create_test_ctx() -> FlownodeContext {
let mut tri_map = IdToNameMap::new();
// deprecated: use `numbers_with_ts` instead since this table has no timestamp column
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leave a fixme here @discord9 , don't forget to update the tests.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leave a fixme here @discord9 , don't forget to update the tests.

will fix this in a separate PR once this get merged since it' unrelated to update datafusion

src/flow/src/transform/literal.rs Show resolved Hide resolved
src/flow/src/transform/literal.rs Outdated Show resolved Hide resolved
src/mito2/src/read/projection.rs Outdated Show resolved Hide resolved
src/mito2/src/sst/parquet/format.rs Outdated Show resolved Hide resolved
@@ -195,6 +195,37 @@ impl ScalarCalculate {
}
}

impl PartialOrd for ScalarCalculate {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NonBlocking: maybe we can add some macros to help implement PartialOrd for plans.

src/servers/Cargo.toml Outdated Show resolved Hide resolved
Comment on lines +400 to +406
tint TINYINT,
sint SMALLINT,
i INT,
bint INT64,
bint BIGINT,
v VARCHAR,
f FLOAT,
d FLOAT64,
d DOUBLE,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if we have some document about these that need to be updated

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we don't have to do that. This is the expected behavior implemented in #2331

We forgot to fix this when we updated the sqlparser. You can find out some types have alias but some do not in this test. Now this PR fixes this and removes the TODO. cc @killme2008

Copy link
Member

@waynexia waynexia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've reviewed them and hereby cast my vote 🫡

image

src/query/src/range_select/plan.rs Outdated Show resolved Hide resolved
@evenyag
Copy link
Contributor Author

evenyag commented Jan 22, 2025

rustc may panic...
https://github.com/GreptimeTeam/greptimedb/actions/runs/12908415598/job/35994232050

I'll retry that test.

Copy link
Collaborator

@fengjiachun fengjiachun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rest LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking-change This pull request contains breaking changes. docs-not-required This change does not impact docs.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants