feat: ALTER TABLE .. RENAME COLUMN .. #2906

Closed
wants to merge 1,287 commits into from

Conversation

vrongmeal
Contributor

This doesn't work yet: we also need to update the Delta metadata with the updated schema, and I haven't been able to figure out how to do that.

Fixes: #2900

tychoish and others added 30 commits December 27, 2023 11:42
A number of improvements to the CLI highlighter:

- better support for types
- more support for various symbols `&|+-/%` 
- function highlighting!


![demo](https://github.com/GlareDB/glaredb/assets/21327470/39024a73-5aec-46c8-8e96-532568b36cdc)
```sql
-- List schemas
select * from list_schemas(external_db);

-- List tables in schema
select * from list_tables(external_db, dbo);

-- List columns for a given table
select * from list_columns(external_db, dbo, abc);
```

---------

Signed-off-by: Vaibhav <[email protected]>
Adds support for ClickHouse as a data source.

`read_clickhouse`:
```
> select * from read_clickhouse('clickhouse://localhost:9000/default', 'bikeshare_stations') limit 1;
┌────────────┬──────────┬────────┬─────────┬───┬─────────────────┬───────┬──────────────────┬───────────────────┐
│ station_id │ name     │ status │ address │ … │ footprint_width │ notes │ council_district │ modified_date     │
│         ── │ ──       │ ──     │ ──      │   │              ── │ ──    │               ── │ ──                │
│      Int32 │ Utf8     │ Utf8   │ Utf8    │   │         Float32 │ Utf8  │            Int32 │ Timestamp<s, UTC> │
╞════════════╪══════════╪════════╪═════════╪═══╪═════════════════╪═══════╪══════════════════╪═══════════════════╡
│          0 │ South C… │ active │ 1901 S… │ … │            10.0 │ In t… │                9 │ 2022-03-04T09:01… │
└────────────┴──────────┴────────┴─────────┴───┴─────────────────┴───────┴──────────────────┴───────────────────┘
```

External database:
```
> create external database ch
::: from clickhouse
::: options ( connection_string = 'clickhouse://localhost:9000/default' );
Database created
> select status, address from ch.default.bikeshare_stations limit 1;
┌────────┬──────────────────────────┐
│ status │ address                  │
│ ──     │ ──                       │
│ Utf8   │ Utf8                     │
╞════════╪══════════════════════════╡
│ active │ 1901 South Congress Ave. │
└────────┴──────────────────────────┘
```

External table:
```
> create external table stations
::: from clickhouse
::: options ( connection_string = 'clickhouse://localhost:9000/default',
:::           table = 'bikeshare_stations' );
Table created
> select council_district, modified_date from stations limit 1;
┌──────────────────┬─────────────────────┐
│ council_district │ modified_date       │
│               ── │ ──                  │
│            Int32 │ Timestamp<s, UTC>   │
╞══════════════════╪═════════════════════╡
│                9 │ 2022-03-04T09:01:00 │
└──────────────────┴─────────────────────┘
```

---

Follow up items: #2315
Adds partitioning (sharding) of result sets using the hashing method. 

Closes #2220
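
For readers unfamiliar with hash partitioning, here is a minimal sketch of the general idea, not the code from this change. The function name `hash_partition` and the `(key, value)` row shape are assumptions made for illustration; the actual implementation works on query result sets.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Illustrative only: assign each row to one of `num_partitions` buckets
/// by hashing its key. Rows with equal keys always land in the same bucket.
fn hash_partition<K: Hash, V>(rows: Vec<(K, V)>, num_partitions: usize) -> Vec<Vec<(K, V)>> {
    assert!(num_partitions > 0, "need at least one partition");
    let mut partitions: Vec<Vec<(K, V)>> = (0..num_partitions).map(|_| Vec::new()).collect();
    for (key, value) in rows {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        // Reduce the 64-bit hash to a partition index.
        let idx = (hasher.finish() % num_partitions as u64) as usize;
        partitions[idx].push((key, value));
    }
    partitions
}
```

Every input row ends up in exactly one partition, so the partitions can be processed independently and concatenated afterwards.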
- Closes #2330
- Closes #2328
- Closes #2329
- Closes #2327
- Closes #2325
- Closes #2326
- Closes #2323
- Closes #2324
- Closes #2322

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Fixes some lints, primarily `.get(0)` -> `.first()`.

Some interesting stuff with async in traits:
https://blog.rust-lang.org/2023/12/21/async-fn-rpit-in-traits.html
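
As a small illustration of the `.get(0)` -> `.first()` lint fix, the two calls are equivalent on slices, and `.first()` states the intent directly. The function `first_column` is a made-up example, not code from this change.

```rust
/// Illustrative only: return the first column name, if there is one.
fn first_column(columns: &[String]) -> Option<&String> {
    // Before the lint fix this would have been written `columns.get(0)`.
    columns.first()
}
```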
Partially implements #2332.

This PR is limited to the table function only.

Inserts and `CREATE EXTERNAL` will be done in separate PRs.
- Closes #2357
- Closes #2358
- Closes #2355
- Closes #2356
- Closes #2354
- Closes #2352
- Closes #2353
- Closes #2351

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Instead of duplicating the function definition itself.

Also a few cleanups.

Existing SLTs assert that this works as expected. The SLTs for
`parquet_scan`, etc. are just copies of the `read_parquet` ones.
```
> select * from read_csv('./testdata/csv/delimiter.csv', delimiter => ';');
┌───────┬──────────────┬─────────┐
│  col1 │ col2         │    col3 │
│    ── │ ──           │      ── │
│ Int64 │ Utf8         │ Float64 │
╞═══════╪══════════════╪═════════╡
│     1 │ hello, world │ 3.90000 │
│     2 │ HELLO, WORLD │ 4.90000 │
└───────┴──────────────┴─────────┘
```

Depends on #2364

Adds the plumbing for getting options from the user in our various scan
functions. The only one wired up right now is 'delimiter' for csv, but
other options should be very straightforward to add in now.
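
A rough sketch of what this kind of option plumbing can look like, assuming the named arguments arrive as string key/value pairs. `CsvOptions` and `csv_options` are hypothetical names for illustration; the real types and defaults in the scan functions may differ.

```rust
use std::collections::HashMap;

/// Hypothetical CSV scan options; only the delimiter is modelled here.
#[derive(Debug, PartialEq)]
struct CsvOptions {
    delimiter: u8,
}

/// Build options from named arguments such as `delimiter => ';'`.
/// A missing `delimiter` falls back to a comma (an assumed default).
fn csv_options(named: &HashMap<String, String>) -> Result<CsvOptions, String> {
    let delimiter = match named.get("delimiter") {
        None => b',',
        Some(s) if s.len() == 1 => s.as_bytes()[0],
        Some(s) => return Err(format!("delimiter must be a single byte, got {s:?}")),
    };
    Ok(CsvOptions { delimiter })
}
```

With this shape, adding another option is a matter of adding a field and one more lookup.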
- Adds back in examples/descriptions for some of the read_* functions
- Adds some tests asserting that we're actually getting entries in the
functions table.
- Fixes panic when using non-table function as a table factor.
Closes #2368

Futzing around with function aliases. This replaces `HashMap` with
`AliasMap` to allow multiple keys to point to the same object. Also adds
some tests.
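
One way to let multiple keys point to the same object is to store each value once and map every alias to its slot. The sketch below shows that idea under my own assumptions; it reuses the name `AliasMap`, but the fields and methods are invented here and are not the type added in this change.

```rust
use std::collections::HashMap;

/// Illustrative alias map: several keys resolve to one stored value.
struct AliasMap<V> {
    values: Vec<V>,
    index: HashMap<String, usize>,
}

impl<V> AliasMap<V> {
    fn new() -> Self {
        Self { values: Vec::new(), index: HashMap::new() }
    }

    /// Store `value` once and register every alias as a key for it.
    fn insert(&mut self, aliases: &[&str], value: V) {
        let slot = self.values.len();
        self.values.push(value);
        for alias in aliases {
            self.index.insert(alias.to_string(), slot);
        }
    }

    /// Look up a value by any of its aliases.
    fn get(&self, key: &str) -> Option<&V> {
        self.index.get(key).map(|&slot| &self.values[slot])
    }

    /// Iterate over stored values once each, not once per alias.
    fn values(&self) -> impl Iterator<Item = &V> {
        self.values.iter()
    }
}
```

Registering a function under both `read_parquet` and `parquet_scan` then yields the same entry for either name, while iteration still reports it once.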

I needed to add a workaround for 'array_to_string' for one of the
iterators, see #2371.
closes #2373 

```sql
> select function_name, parameters from glare_catalog.functions WHERE function_name = 'read_excel';
┌───────────────┬─────────────────────────────────────────────────────────────────────────┐
│ function_name │ parameters                                                              │
│ ──            │ ──                                                                      │
│ Utf8          │ List<Utf8>                                                              │
╞═══════════════╪═════════════════════════════════════════════════════════════════════════╡
│ read_excel    │ [Utf8, Utf8, sheet_name: Utf8, infer_rows: UInt64, has_header: Boolean] │
└───────────────┴─────────────────────────────────────────────────────────────────────────┘
```
I realized I branched off `slt-ci` instead of main for this. So this
depends on #2363

closes #2346
universalmind303 and others added 28 commits March 27, 2024 19:14
Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Still a few items to do:

- [x] get functions removed from `BuiltinScalarFunctions` added back in
(Isnan, Encode, ...)
- [x] doublecheck `array_append` functionality
Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
closes #2777

Note: `ident` preserves case sensitivity while `col` does not.

The `HEAD` method might not be supported. Ignore client side errors and
assume we have `0` content length.
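
The fallback can be sketched as follows. `HeadError` and `content_length` are hypothetical names, and the choice to keep propagating non-client errors is my assumption; the description above only says client-side errors are ignored.

```rust
/// Hypothetical outcome of a failed HEAD request.
#[derive(Debug)]
enum HeadError {
    /// 4xx response, e.g. 405 Method Not Allowed.
    Client(u16),
    /// Anything else (5xx, network failure, ...).
    Other(String),
}

/// Resolve the content length, treating client-side errors as length 0.
fn content_length(head: Result<u64, HeadError>) -> Result<u64, HeadError> {
    match head {
        Ok(len) => Ok(len),
        // The server may simply not support HEAD; assume length 0.
        Err(HeadError::Client(_)) => Ok(0),
        // Assumption: other failures are still reported to the caller.
        Err(other) => Err(other),
    }
}
```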

Fixes: #2828

---------

Signed-off-by: Vaibhav <[email protected]>
Co-authored-by: Sam Kleinman <[email protected]>
- Closes #2894
- Closes #2893
- Closes #2892
- Closes #2891
- Closes #2890
- Closes #2889
- Closes #2887
- Closes #2886
- Closes #2884
- Closes #2883
- Closes #2881
- Closes #2880

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
This fixes a bug introduced by DataFusion 36.

Some operations, such as invalid casts, previously failed during
optimization; they now fail at runtime.

Because they fail at runtime, we would still create the catalog entry
and the table, and only fail afterwards during the insert. This left
both the catalog and the storage in a bad state that didn't accurately
reflect the operation. For example:

```sql
create table invalid_ctas as (select cast('test' as int) as 'bad_cast');
```

This updated the catalog and created a table for `invalid_ctas`, but
querying that table returned an error.


This PR makes sure that the operation is successful before committing
the changes. It does so by exposing new methods on the catalog client:
`commit_state`, `mutate_and_commit`, and `mutate`, in place of the
previous single `mutate`.

The existing code was refactored to use `mutate_and_commit`, which is
the same as the old `mutate`. The code that requires the commit
semantics (create table) now uses `mutate` to first get an uncommitted
catalog state with those changes, then does all of its other actions:

- create the "native" table
- _Optional_ inserts into the table
- commits the catalog state

If any of the operations before the commit fail, then the catalog
mutations are never committed.
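
The stage-then-commit flow can be sketched with an in-memory stand-in. The method names `mutate`, `commit_state`, and `mutate_and_commit` come from the description above, but their signatures, the `Catalog` and `Pending` types, and `create_table_as` are simplified assumptions rather than the real catalog client API.

```rust
/// Hypothetical in-memory stand-in for the catalog client.
#[derive(Default)]
struct Catalog {
    committed: Vec<String>,
}

/// Uncommitted catalog state returned by `mutate`.
struct Pending {
    tables: Vec<String>,
}

impl Catalog {
    /// Build a new state containing the change without publishing it.
    fn mutate(&self, new_table: &str) -> Pending {
        let mut tables = self.committed.clone();
        tables.push(new_table.to_string());
        Pending { tables }
    }

    /// Publish a previously staged state.
    fn commit_state(&mut self, pending: Pending) {
        self.committed = pending.tables;
    }

    /// Old behaviour: stage and publish in one step.
    fn mutate_and_commit(&mut self, new_table: &str) {
        let pending = self.mutate(new_table);
        self.commit_state(pending);
    }
}

/// Create-table flow: stage, run the fallible work, commit only on success.
fn create_table_as(
    catalog: &mut Catalog,
    name: &str,
    insert: impl FnOnce() -> Result<(), String>,
) -> Result<(), String> {
    let pending = catalog.mutate(name);
    // Creating the native table and the optional insert would happen here.
    insert()?;
    catalog.commit_state(pending);
    Ok(())
}
```

If the insert closure returns an error, the function returns early and the staged state is dropped, so the committed catalog never sees the new table.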

---------

Co-authored-by: Sean Smith <[email protected]>
This doesn't work yet: we also need to update the Delta metadata with
the updated schema, and I haven't been able to figure out how to do that.

Fixes: #2900

Signed-off-by: Vaibhav <[email protected]>
Development

Successfully merging this pull request may close these issues.

feat: rename columns
10 participants