Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[c++, r] Update and refactor nanoarrow #2188

Merged
merged 39 commits into from
Apr 3, 2024
Merged
Show file tree
Hide file tree
Changes from 32 commits
Commits
Show all changes
39 commits
Select commit Hold shift + click to select a range
fbc9f5f
Update nanoarrow vendored files to nanoarrow 0.4.0
eddelbuettel Feb 19, 2024
9a5e048
Low-level wiring of nanoarrow at sr_* level
eddelbuettel Mar 1, 2024
0d77614
More lower-level wiring of nanoarrow
eddelbuettel Mar 1, 2024
7d37e6b
WIP snapshot with nanoarrow wired into libtiledbsoma
eddelbuettel Mar 6, 2024
ea0adee
Ensure nullable is set correctly in either case
eddelbuettel Mar 7, 2024
7cbbb9d
Context wrapped in a special purpose struct should not finalize
eddelbuettel Mar 7, 2024
b86ff81
Simpler and faster r-ci.yaml
eddelbuettel Mar 7, 2024
9bc9f05
Use nanoarrow 0.4.0 consistently
eddelbuettel Mar 7, 2024
4f5dbe1
Refined arrow_adapter
eddelbuettel Mar 8, 2024
ffb8cd3
Set increased timeout for download.file to survive GH flakyness
eddelbuettel Mar 8, 2024
4454f82
Turn trace back of, do not include carrow in cli
eddelbuettel Mar 8, 2024
1986c45
Do not include carrow.h in reindexer.cc
eddelbuettel Mar 8, 2024
1267833
WIP changes expanding type map, suppressing schema release
eddelbuettel Mar 18, 2024
3a4add7
[c++] Fix segfault issues
nguyenv Mar 15, 2024
757746b
Add additional necessary strdup
nguyenv Mar 18, 2024
5fcafa4
No longer to protect one statement
eddelbuettel Mar 19, 2024
a7f476f
Support TILEDB_DATETIME_DAY aka Date as well
eddelbuettel Mar 19, 2024
531156a
Meh
eddelbuettel Mar 19, 2024
7db58ff
Meh with version 14.0.0 and not 14.0.6 because ... sure
eddelbuettel Mar 19, 2024
9ab125c
Remove initialization setters covered by nanoarrow use
eddelbuettel Mar 19, 2024
bd4ed2a
Ensure DATETIME columns get Arrow coltype reset
eddelbuettel Mar 20, 2024
7f999ba
Add more date and datetime support
eddelbuettel Mar 21, 2024
48c3816
Additional conversion
eddelbuettel Mar 26, 2024
cab5bf0
Post-rebase change
eddelbuettel Mar 27, 2024
ee445f9
Heeding time to the lord of linting is time well spent some say
eddelbuettel Mar 27, 2024
71d1a75
Heeding time to the lord of linting is time well spent some say
eddelbuettel Mar 27, 2024
54e7953
Correct another delete to free
eddelbuettel Mar 27, 2024
06d61e1
Additional non-nullptr protection
eddelbuettel Mar 28, 2024
f64de8e
make format
eddelbuettel Apr 1, 2024
d6efd8c
Additional test conditioner
eddelbuettel Apr 1, 2024
ace4cc7
Correcting one buffer size selection
eddelbuettel Apr 2, 2024
1c78d37
make format
eddelbuettel Apr 2, 2024
c98a396
Remove carrow.h and reference to it
eddelbuettel Apr 2, 2024
445589f
Cleanups
eddelbuettel Apr 2, 2024
4abee1b
Use nanoarrow.{c,hpp} via tiledbsoma/utils/
eddelbuettel Apr 2, 2024
8a6929d
Re-activate -Werror
eddelbuettel Apr 2, 2024
e60a6c3
Chore
eddelbuettel Apr 2, 2024
5450dfe
High-productivity afternoon
eddelbuettel Apr 2, 2024
7bb42fb
Correct an format string error message
eddelbuettel Apr 2, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 6 additions & 6 deletions .github/workflows/r-ci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -79,13 +79,13 @@ jobs:
# if: ${{ matrix.os != 'macOS-latest' }}
# run: cd apis/r && Rscript -e "options(bspm.version.check=TRUE); install.packages('tiledb', repos = c('https://eddelbuettel.r-universe.dev/bin/linux/jammy/4.3/', 'https://cloud.r-project.org'))"

- name: Install r-universe build of SeuratObject (macOS)
if: ${{ matrix.os == 'macOS-latest' }}
run: cd apis/r && Rscript -e "install.packages('SeuratObject', repos = c('https://mojaveazure.r-universe.dev', 'https://cloud.r-project.org'))"
#- name: Install r-universe build of SeuratObject (macOS)
eddelbuettel marked this conversation as resolved.
Show resolved Hide resolved
# if: ${{ matrix.os == 'macOS-latest' }}
# run: cd apis/r && Rscript -e "install.packages('SeuratObject', repos = c('https://mojaveazure.r-universe.dev', 'https://cloud.r-project.org'))"

- name: Install r-universe build of SeuratObject (linux)
if: ${{ matrix.os == 'ubuntu-latest' }}
run: cd apis/r && Rscript -e "options(bspm.version.check=TRUE); install.packages('SeuratObject', repos = c('https://mojaveazure.r-universe.dev/bin/linux/jammy/4.3/', 'https://cloud.r-project.org'))"
#- name: Install r-universe build of SeuratObject (linux)
# if: ${{ matrix.os == 'ubuntu-latest' }}
# run: cd apis/r && Rscript -e "options(bspm.version.check=TRUE); install.packages('SeuratObject', repos = c('https://mojaveazure.r-universe.dev/bin/linux/jammy/4.3/', 'https://cloud.r-project.org'))"

- name: Dependencies
run: cd apis/r && tools/r-ci.sh install_all
Expand Down
2 changes: 1 addition & 1 deletion apis/python/src/tiledbsoma/reindexer.cc
Original file line number Diff line number Diff line change
Expand Up @@ -31,7 +31,7 @@
*/

#include <tiledbsoma/reindexer/reindexer.h>
#include <tiledbsoma/utils/carrow.h>
// #include <tiledbsoma/utils/carrow.h>
eddelbuettel marked this conversation as resolved.
Show resolved Hide resolved
#include "common.h"

#define DENUM(x) .value(#x, TILEDB_##x)
Expand Down
6 changes: 4 additions & 2 deletions apis/r/DESCRIPTION
Original file line number Diff line number Diff line change
Expand Up @@ -45,11 +45,13 @@ Imports:
spdl,
rlang,
tools,
tibble
tibble,
nanoarrow
LinkingTo:
Rcpp,
RcppSpdlog,
RcppInt64
RcppInt64,
nanoarrow
Additional_repositories: https://ghrr.github.io/drat
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.3.1
Expand Down
1 change: 1 addition & 0 deletions apis/r/NAMESPACE
Original file line number Diff line number Diff line change
Expand Up @@ -76,6 +76,7 @@ export(tiledbsoma_stats_show)
export(write_soma)
import(R6)
import(methods)
import(nanoarrow)
import(utils)
importFrom(Matrix,as.matrix)
importFrom(Matrix,sparseMatrix)
Expand Down
6 changes: 6 additions & 0 deletions apis/r/R/RcppExports.R
Original file line number Diff line number Diff line change
Expand Up @@ -92,6 +92,12 @@
.Call(`_tiledbsoma_sr_complete`, sr)
}

#' @noRd
#' @import nanoarrow
create_empty_arrow_table <- function() {
.Call(`_tiledbsoma_create_empty_arrow_table`)

Check warning on line 98 in apis/r/R/RcppExports.R

View check run for this annotation

Codecov / codecov/patch

apis/r/R/RcppExports.R#L98

Added line #L98 was not covered by tests
}

sr_next <- function(sr) {
.Call(`_tiledbsoma_sr_next`, sr)
}
Expand Down
11 changes: 6 additions & 5 deletions apis/r/R/utils-arrow.R
Original file line number Diff line number Diff line change
Expand Up @@ -66,12 +66,14 @@ tiledb_type_from_arrow_type <- function(x, is_dim) {
utf8 = "UTF8",
string = "UTF8",
large_utf8 = "UTF8",
# date32 = "date32",
# based on what TileDB supports
date32 = "DATETIME_DAY",
# date64 = "date64",
# time32 = "time32",
# time64 = "time64",
# null = "null",
# timestamp = "timestamp",
# based on what TileDB supports with a default msec res.
timestamp = "DATETIME_MS",
# decimal128 = "decimal128",
# decimal256 = "decimal256",
# struct = "struct",
Expand Down Expand Up @@ -240,11 +242,10 @@ arrow_schema_from_tiledb_schema <- function(x) {
arrow::schema(c(dimfields, attfields))
}

#' Validate external pointer to ArrowArray
#' Validate external pointer to ArrowArray which is embedded in a nanoarrow S3 type
#' @noRd
check_arrow_pointers <- function(arrlst) {
stopifnot("First argument must be an external pointer to ArrowArray" = check_arrow_array_tag(arrlst[[1]]),
"Second argument must be an external pointer to ArrowSchema" = check_arrow_schema_tag(arrlst[[2]]))
stopifnot(inherits(arrlst, "nanoarrow_array"))
}

#' Validate compatibility of Arrow data types
Expand Down
7 changes: 3 additions & 4 deletions apis/r/R/utils-readerTransformers.R
Original file line number Diff line number Diff line change
Expand Up @@ -2,14 +2,13 @@
#'
#' @description Converts the results of a \link{soma_array_reader} or
#' \link{sr_next} to an arrow::\link[arrow]{Table}
#' @param x A List object with two pointers to Arrow array data and schema
#' @param x A nanoarrow_array object which is itself a wrapper around the external pointer
#' to the Arrow array data; the schema external pointer is added to it as well
#' @return arrow::\link[arrow]{Table}
#' @noRd
soma_array_to_arrow_table <- function(x) {
check_arrow_pointers(x)
arrow::as_arrow_table(
arrow::RecordBatch$import_from_c(x$array_data, x$schema)
)
arrow::as_arrow_table(x)
}

#' Transformer function: Arrow table to Matrix::sparseMatrix
Expand Down
15 changes: 13 additions & 2 deletions apis/r/src/RcppExports.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@
#endif

// soma_array_reader
Rcpp::List soma_array_reader(const std::string& uri, Rcpp::Nullable<Rcpp::CharacterVector> colnames, Rcpp::Nullable<Rcpp::XPtr<tiledb::QueryCondition>> qc, Rcpp::Nullable<Rcpp::List> dim_points, Rcpp::Nullable<Rcpp::List> dim_ranges, std::string batch_size, std::string result_order, const std::string& loglevel, Rcpp::Nullable<Rcpp::CharacterVector> config);
nanoarrowXPtr soma_array_reader(const std::string& uri, Rcpp::Nullable<Rcpp::CharacterVector> colnames, Rcpp::Nullable<Rcpp::XPtr<tiledb::QueryCondition>> qc, Rcpp::Nullable<Rcpp::List> dim_points, Rcpp::Nullable<Rcpp::List> dim_ranges, std::string batch_size, std::string result_order, const std::string& loglevel, Rcpp::Nullable<Rcpp::CharacterVector> config);
RcppExport SEXP _tiledbsoma_soma_array_reader(SEXP uriSEXP, SEXP colnamesSEXP, SEXP qcSEXP, SEXP dim_pointsSEXP, SEXP dim_rangesSEXP, SEXP batch_sizeSEXP, SEXP result_orderSEXP, SEXP loglevelSEXP, SEXP configSEXP) {
BEGIN_RCPP
Rcpp::RObject rcpp_result_gen;
Expand Down Expand Up @@ -129,8 +129,18 @@
return rcpp_result_gen;
END_RCPP
}
// create_empty_arrow_table
nanoarrowXPtr create_empty_arrow_table();
RcppExport SEXP _tiledbsoma_create_empty_arrow_table() {
BEGIN_RCPP
Rcpp::RObject rcpp_result_gen;
Rcpp::RNGScope rcpp_rngScope_gen;
rcpp_result_gen = Rcpp::wrap(create_empty_arrow_table());
return rcpp_result_gen;
END_RCPP

Check warning on line 140 in apis/r/src/RcppExports.cpp

View check run for this annotation

Codecov / codecov/patch

apis/r/src/RcppExports.cpp#L134-L140

Added lines #L134 - L140 were not covered by tests
}
// sr_next
Rcpp::List sr_next(Rcpp::XPtr<tdbs::SOMAArray> sr);
nanoarrowXPtr sr_next(Rcpp::XPtr<tdbs::SOMAArray> sr);
RcppExport SEXP _tiledbsoma_sr_next(SEXP srSEXP) {
BEGIN_RCPP
Rcpp::RObject rcpp_result_gen;
Expand Down Expand Up @@ -220,6 +230,7 @@
{"_tiledbsoma_shape", (DL_FUNC) &_tiledbsoma_shape, 2},
{"_tiledbsoma_sr_setup", (DL_FUNC) &_tiledbsoma_sr_setup, 10},
{"_tiledbsoma_sr_complete", (DL_FUNC) &_tiledbsoma_sr_complete, 1},
{"_tiledbsoma_create_empty_arrow_table", (DL_FUNC) &_tiledbsoma_create_empty_arrow_table, 0},
{"_tiledbsoma_sr_next", (DL_FUNC) &_tiledbsoma_sr_next, 1},
{"_tiledbsoma_tiledbsoma_stats_enable", (DL_FUNC) &_tiledbsoma_tiledbsoma_stats_enable, 0},
{"_tiledbsoma_tiledbsoma_stats_disable", (DL_FUNC) &_tiledbsoma_tiledbsoma_stats_disable, 0},
Expand Down
Loading
Loading