[NOMERGE] [c++/python] Move DataFrame write path to use C++ bindings #2367

Closed
wants to merge 70 commits into from
70 commits
0f512d4
[python] Use bindings for `DenseNDArray` readpath
nguyenv Feb 13, 2024
6af74c5
Update `DenseNDArray` with new `SOMAContext`
nguyenv Feb 23, 2024
7ee29ed
Add documentation for _opener virtual method
nguyenv Feb 23, 2024
3412ff5
Correct documentation
nguyenv Feb 23, 2024
1fd5a5b
Pass in as kw args; replace virtual method
nguyenv Feb 23, 2024
7185a5e
Run formatting
nguyenv Feb 23, 2024
614d886
Correct soma_array.cc post-merge
nguyenv Feb 23, 2024
6f12efd
Changes according to review
nguyenv Feb 24, 2024
a8c3836
Move under _wrapper_type
nguyenv Feb 24, 2024
7a0c6b2
[python] Use bindings for `SparseNDArray` readpath
nguyenv Feb 13, 2024
71e13f5
WIP export to c
nguyenv Feb 22, 2024
86c2dec
WIP arrow type to tiledb type converter
nguyenv Feb 24, 2024
9b39e9e
WIP create dimensions and attrs
nguyenv Feb 24, 2024
c0506c3
WIP change SOMADataFrame to take ArrowSchema
nguyenv Feb 24, 2024
e484bf3
Use ArrowSchema instead of TileDB Schema to create
nguyenv Feb 24, 2024
f581fad
WIP attach
nguyenv Feb 25, 2024
84b8006
WIP replace with setup_write
nguyenv Feb 25, 2024
82593a0
WIP create domain / extents
nguyenv Feb 26, 2024
13ab9a4
Use ColumnIndexInfo instead
nguyenv Feb 26, 2024
7a32937
Create clib.SOMADataFrame
nguyenv Feb 26, 2024
eab7652
WIP metadata issue unpacking values
nguyenv Feb 26, 2024
3297b70
Fix metadata
nguyenv Feb 26, 2024
1e68d6d
WIP set_data takes in offsets and validity
nguyenv Feb 26, 2024
77412e6
WIP write with passed in offsets and validities
nguyenv Feb 26, 2024
974f0fa
WIP handle vars correctly
nguyenv Feb 27, 2024
212955b
WIP fix segfaults from var-sized writes and metadata
nguyenv Feb 27, 2024
004a06a
WIP create methods should be void
nguyenv Feb 28, 2024
10a0d32
WIP fix errors related to span indexing past length for offset
nguyenv Feb 28, 2024
9a8b096
WIP add timestamps to create functions
nguyenv Feb 29, 2024
15d649d
WIP correct metadata for soma array
nguyenv Feb 29, 2024
d6c8c72
[WIP] Refactor metadata
nguyenv Feb 29, 2024
6f5ec4b
WIP handle _read_nonempty_domain from c++
nguyenv Mar 7, 2024
9bff446
WIP handle nullable attrs
nguyenv Mar 7, 2024
6ae926d
WIP fill validity buffer if nullptr
nguyenv Mar 7, 2024
23a81f2
WIP add enumerations to ArraySchema
nguyenv Mar 7, 2024
d4d88e0
WIP check that column to write is enum
nguyenv Mar 7, 2024
27f2614
WIP extend enumerations on write
nguyenv Mar 8, 2024
f316647
WIP only extend enmr when present
nguyenv Mar 8, 2024
f8a2d4a
WIP do not extend if no values present
nguyenv Mar 8, 2024
a3d39c2
WIP
nguyenv Mar 11, 2024
b88d580
Add common unit test file
nguyenv Mar 12, 2024
68aa70a
WIP pass domain and extents as StructArray
nguyenv Mar 12, 2024
7277a0f
WIP correct offset buffer
nguyenv Mar 12, 2024
42bb754
Throw TileDBErrors as TileDBSOMAErrors
nguyenv Mar 12, 2024
ed83905
WIP use ASCII for string dims
nguyenv Mar 12, 2024
eebb40d
Handle extending enumerations for non-var attrs
nguyenv Mar 12, 2024
5ad8293
Unsupport arrow types should throw TypeError
nguyenv Mar 13, 2024
4fc8f9a
Cast domains and extents to correct types
nguyenv Mar 13, 2024
84fb0f0
WIP
nguyenv Mar 14, 2024
09b08d7
Correctly throw SOMAError
nguyenv Mar 14, 2024
11e6a05
WIP correctly update metadata values
nguyenv Mar 18, 2024
a139025
WIP cast pyarrow boolean to uint8 when writing to tiledb array
nguyenv Mar 19, 2024
5256955
WIP fix existing enum error; fix byte display issue
nguyenv Mar 19, 2024
77fb92a
Order dimensions in index column name order
nguyenv Mar 19, 2024
774f66a
Clean up enum extend code
nguyenv Mar 21, 2024
d46ed00
Fix more issues with metadata in write mode
nguyenv Mar 21, 2024
14e4846
Correct metadata delete
nguyenv Mar 21, 2024
0b029d7
Read in config
nguyenv Mar 25, 2024
5417c17
WIP add platform config in C++
nguyenv Mar 26, 2024
2bb57f1
WIP
nguyenv Mar 26, 2024
67a0b54
Fix writes when using slice of arrow table
nguyenv Mar 27, 2024
6ce288a
WIP wrong: do not use capacity but actual max
nguyenv Mar 28, 2024
f804644
WIP do not extend enumerations past limit for index dtype
nguyenv Mar 29, 2024
482c083
WIP clears buffers after running
nguyenv Mar 29, 2024
6f1f07c
WIP
nguyenv Apr 1, 2024
4243537
WIP update enumeration index values when extending
nguyenv Apr 2, 2024
fc35aba
Bind the dataframe.create with the timestamp
nguyenv Apr 3, 2024
6637b82
WIP fix several errors for macos
nguyenv Apr 5, 2024
ef8d1ce
Correct Boolean value writes for enum values
nguyenv Apr 5, 2024
031f9a9
Add in missing RuntimeError
nguyenv Apr 5, 2024
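Several of the commits above deal with extending enumerations on write ("extend enumerations on write", "do not extend if no values present", "do not extend enumerations past limit for index dtype", "update enumeration index values when extending"). The following is a minimal pure-Python sketch of that idea, not the PR's C++ implementation. The function name `extend_enumeration`, the `INDEX_MAX` table, and the exact limit check (the highest index must fit in the index dtype) are assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple

# Largest value representable by each integer index dtype.
INDEX_MAX = {
    "int8": 2**7 - 1, "uint8": 2**8 - 1,
    "int16": 2**15 - 1, "uint16": 2**16 - 1,
    "int32": 2**31 - 1, "uint32": 2**32 - 1,
    "int64": 2**63 - 1, "uint64": 2**64 - 1,
}


def extend_enumeration(
    existing: Sequence[str],
    incoming_values: Sequence[str],
    incoming_indices: Sequence[Optional[int]],
    index_dtype: str,
) -> Tuple[List[str], List[Optional[int]]]:
    """Hypothetical helper: merge new dictionary values into an existing
    enumeration and remap the incoming indices to the merged positions."""
    # Nothing to add: leave the enumeration untouched.
    if not incoming_values:
        return list(existing), list(incoming_indices)

    position = {value: i for i, value in enumerate(existing)}
    extended = list(existing)
    for value in incoming_values:
        if value not in position:
            position[value] = len(extended)
            extended.append(value)

    # Assumed rule: the highest index must be representable by the index dtype.
    if len(extended) - 1 > INDEX_MAX[index_dtype]:
        raise ValueError(
            f"enumeration would need {len(extended)} values, "
            f"which exceeds what {index_dtype} indices can address"
        )

    # Incoming indices refer to incoming_values; point them at the merged
    # enumeration instead. None stands for a null cell and is passed through.
    remapped = [
        None if i is None else position[incoming_values[i]]
        for i in incoming_indices
    ]
    return extended, remapped
```

For example, writing a column whose dictionary is `["b", "c"]` into an array whose enumeration is `["a", "b"]` yields the merged values `["a", "b", "c"]`, and an incoming index of `0` (meaning `"b"`) becomes `1`.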
2 changes: 1 addition & 1 deletion apis/python/setup.py
@@ -308,7 +308,7 @@ def run(self):
             library_dirs=LIB_DIRS,
             libraries=["tiledbsoma"] + (["tiledb"] if os.name == "nt" else []),
             extra_link_args=CXX_FLAGS,
-            extra_compile_args=["-std=c++17" if os.name != "nt" else "/std:c++17"]
+            extra_compile_args=["-std=c++17" if os.name != "nt" else "/std:c++17", "-g"]
             + CXX_FLAGS,
             language="c++",
         )
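The hunk above appends `-g` for every platform, including the `os.name == "nt"` branch that otherwise uses MSVC-style `/std:c++17`. As a hedged sketch of one way to keep the debug flag platform-aware, the helper below picks `/Zi` (MSVC's debug-information option) on Windows and `-g` elsewhere. The function name `compile_args` and the `debug` parameter are hypothetical and do not appear in the PR.

```python
from typing import List


def compile_args(os_name: str, debug: bool = False) -> List[str]:
    """Hypothetical helper: choose C++17 and debug flags per platform."""
    is_windows = os_name == "nt"
    # Same C++ standard selection as the original expression in setup.py.
    args = ["/std:c++17" if is_windows else "-std=c++17"]
    if debug:
        # GCC and Clang spell debug info as -g; MSVC uses /Zi.
        args.append("/Zi" if is_windows else "-g")
    return args
```

Under this sketch, the call site would read something like `extra_compile_args=compile_args(os.name, debug=True) + CXX_FLAGS`.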
2 changes: 1 addition & 1 deletion apis/python/src/tiledbsoma/_arrow_types.py
@@ -169,7 +169,7 @@
         if attr.enum_label is not None:  # enumerated
             if A is None:
                 A = tiledb.open(uri, ctx=ctx)
-            info = A.enum(name)
+            info = A.enum(attr.enum_label)
             arrow_schema_dict[name] = pa.dictionary(
                 index_type=arrow_type_from_tiledb_dtype(attr.dtype),
                 value_type=arrow_type_from_tiledb_dtype(
Codecov / codecov/patch warning on apis/python/src/tiledbsoma/_arrow_types.py#L172: added line #L172 was not covered by tests.
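The one-line change above matters whenever an attribute's name differs from the label of the enumeration attached to it. The sketch below uses stand-in classes (`FakeAttr`, `FakeArray`, and `enum_values` are all hypothetical, not TileDB types) to show the lookup succeeding by label and failing by attribute name.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class FakeAttr:
    """Stand-in for a schema attribute; not a TileDB class."""
    name: str
    enum_label: Optional[str] = None


class FakeArray:
    """Stand-in for an open array whose enumerations are keyed by label."""

    def __init__(self, enums: Dict[str, List[str]]) -> None:
        self._enums = enums

    def enum(self, label: str) -> List[str]:
        # Raises KeyError when no enumeration has this label.
        return self._enums[label]


def enum_values(array: FakeArray, attr: FakeAttr) -> Optional[List[str]]:
    """Look up an attribute's enumeration by its label, as the fixed line does."""
    if attr.enum_label is None:
        return None
    return array.enum(attr.enum_label)


array = FakeArray({"cell_type_enum": ["B cell", "T cell"]})
attr = FakeAttr(name="cell_type", enum_label="cell_type_enum")
values = enum_values(array, attr)
```

In this toy setup, `array.enum(attr.name)` would raise `KeyError`, which mirrors the kind of failure the old `A.enum(name)` call could hit when the two names differ.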
10 changes: 3 additions & 7 deletions apis/python/src/tiledbsoma/_collection.py
@@ -434,13 +434,9 @@ def __getitem__(self, key: str) -> CollectionElementType:
                 context = self.context
                 timestamp = self.tiledb_timestamp_ms
 
-                try:
-                    wrapper = _tdb_handles.open(uri, mode, context, timestamp)
-                    entry.soma = _factory.reify_handle(wrapper)
-                except SOMAError:
-                    entry.soma = _factory._open_internal(
-                        entry.entry.wrapper_type.open, uri, mode, context, timestamp
-                    )
+                wrapper = _tdb_handles.open(uri, mode, context, timestamp)
+                entry.soma = _factory.reify_handle(wrapper)
+
                 # Since we just opened this object, we own it and should close it.
                 self._close_stack.enter_context(entry.soma)
                 return cast(CollectionElementType, entry.soma)
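The behavioural effect of the hunk above is that a `SOMAError` raised while opening a collection member is no longer swallowed and retried through a second code path; it now reaches the caller. A minimal sketch of the two shapes follows. The `SOMAError` class here is a local stand-in, and `open_with_fallback`, `open_direct`, `failing_open`, and `legacy_open` are hypothetical names used only for illustration.

```python
from typing import Callable


class SOMAError(Exception):
    """Local stand-in for the library's error type."""


def open_with_fallback(primary: Callable[[], str], fallback: Callable[[], str]) -> str:
    """Old shape: try the primary opener, fall back on SOMAError."""
    try:
        return primary()
    except SOMAError:
        return fallback()


def open_direct(primary: Callable[[], str]) -> str:
    """New shape: a single opener, errors propagate to the caller."""
    return primary()


def failing_open() -> str:
    raise SOMAError("cannot open")


def legacy_open() -> str:
    return "opened via legacy path"


old_result = open_with_fallback(failing_open, legacy_open)

try:
    open_direct(failing_open)
    new_result = "opened"
except SOMAError as err:
    new_result = f"error surfaced: {err}"
```

With a single open path, a failure in the handle layer surfaces immediately instead of being masked by a second attempt through a different opener.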