Merge branch 'main' of github.com:single-cell-data/TileDB-SOMA
johnkerl committed May 13, 2024
2 parents de0b46e + e90f555 commit 07a32c3
Showing 21 changed files with 196 additions and 113 deletions.
10 changes: 8 additions & 2 deletions .github/workflows/python-ci-minimal.yml
@@ -12,8 +12,14 @@ on:
branches:
- main
- 'release-*'
-paths-ignore:
-- 'apis/r/**'
+paths:
+- '**'
+- '!**.md'
+- '!apis/r/**'
+- '!docs/**'
+- '!.github/**'
+- '.github/workflows/python-ci-minimal.yml'
+- '.github/workflows/python-ci-single.yml'
workflow_dispatch:

jobs:
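The new `paths` block relies on GitHub Actions filter ordering: every file is included by `'**'`, Markdown/R/docs/workflow files are then excluded, and finally the two workflow files are re-included. A minimal sketch of the "last matching pattern wins" semantics, assuming `fnmatch` as an approximation of GitHub's glob matching:

```python
from fnmatch import fnmatch

def should_run(changed_path, patterns):
    """Simplified model of GitHub Actions path filters: patterns are
    evaluated top to bottom, the last matching one wins, and a leading
    '!' excludes rather than includes."""
    run = False
    for pat in patterns:
        negated = pat.startswith("!")
        core = pat[1:] if negated else pat
        if fnmatch(changed_path, core):
            run = not negated
    return run

patterns = [
    "**",
    "!**.md",
    "!apis/r/**",
    "!docs/**",
    "!.github/**",
    ".github/workflows/python-ci-minimal.yml",
    ".github/workflows/python-ci-single.yml",
]
```

With this ordering, an edit under `apis/r/` or to a `.md` file skips the workflow, while editing the workflow file itself still triggers it despite the broad `!.github/**` exclusion.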
8 changes: 4 additions & 4 deletions .github/workflows/python-ci-packaging.yml
@@ -74,7 +74,7 @@ jobs:
run: |
mkdir -p external
# Please do not edit manually -- let scripts/update-tiledb-version.py update this
-wget --quiet https://github.com/TileDB-Inc/TileDB/releases/download/2.22.0/tiledb-linux-x86_64-2.22.0-52e981e.tar.gz
+wget --quiet https://github.com/TileDB-Inc/TileDB/releases/download/2.23.0/tiledb-linux-x86_64-2.23.0-152093b.tar.gz
tar -C external -xzf tiledb-linux-x86_64-*.tar.gz
ls external/lib/
echo "LD_LIBRARY_PATH=$(pwd)/external/lib" >> $GITHUB_ENV
@@ -169,7 +169,7 @@ jobs:
run: |
mkdir -p external
# Please do not edit manually -- let scripts/update-tiledb-version.py update this
-wget --quiet https://github.com/TileDB-Inc/TileDB/releases/download/2.22.0/tiledb-macos-x86_64-2.22.0-52e981e.tar.gz
+wget --quiet https://github.com/TileDB-Inc/TileDB/releases/download/2.23.0/tiledb-macos-x86_64-2.23.0-152093b.tar.gz
tar -C external -xzf tiledb-macos-x86_64-*.tar.gz
ls external/lib/
echo "DYLD_LIBRARY_PATH=$(pwd)/external/lib" >> $GITHUB_ENV
@@ -260,10 +260,10 @@ jobs:
if [ `uname -s` == "Darwin" ];
then
# Please do not edit manually -- let scripts/update-tiledb-version.py update this
-wget --quiet https://github.com/TileDB-Inc/TileDB/releases/download/2.22.0/tiledb-macos-x86_64-2.22.0-52e981e.tar.gz
+wget --quiet https://github.com/TileDB-Inc/TileDB/releases/download/2.23.0/tiledb-macos-x86_64-2.23.0-152093b.tar.gz
else
# Please do not edit manually -- let scripts/update-tiledb-version.py update this
-wget --quiet https://github.com/TileDB-Inc/TileDB/releases/download/2.22.0/tiledb-linux-x86_64-2.22.0-52e981e.tar.gz
+wget --quiet https://github.com/TileDB-Inc/TileDB/releases/download/2.23.0/tiledb-linux-x86_64-2.23.0-152093b.tar.gz
fi
tar -C external -xzf tiledb-*.tar.gz
ls external/lib/
11 changes: 7 additions & 4 deletions .github/workflows/r-ci.yml
@@ -2,9 +2,13 @@ name: TileDB-SOMA R CI

on:
pull_request:
-paths-ignore:
-- "apis/python/**"
-- ".pre-commit-config.yaml"
+paths:
+- '**'
+- '!**.md'
+- '!apis/python/**'
+- '!docs/**'
+- '!.github/**'
+- '.github/workflows/r-ci.yml'
push:
branches:
- main
@@ -73,7 +77,6 @@ jobs:
- name: Install BioConductor package SingleCellExperiment
run: cd apis/r && tools/r-ci.sh install_bioc SingleCellExperiment


# Uncomment these next two stanzas as needed whenever we've just released a new tiledb-r for
# which source is available but CRAN releases (and hence update r2u binaries) are not yet:
#
15 changes: 14 additions & 1 deletion .github/workflows/r-python-interop-testing.yml
@@ -32,8 +32,21 @@ jobs:
- name: Bootstrap
run: cd apis/r && tools/r-ci.sh bootstrap

+- name: Set additional repositories (Linux)
+if: ${{ matrix.os == 'ubuntu-latest' }}
+run: |
+rversion <- paste(strsplit(as.character(getRversion()), split = '\\.')[[1L]][1:2], collapse = '.')
+codename <- system('. /etc/os-release; echo ${VERSION_CODENAME}', intern = TRUE)
+repo <- "https://tiledb-inc.r-universe.dev"
+(opt <- sprintf('options(repos = c("%s/bin/linux/%s/%s", "%s", getOption("repos")))', repo, codename, rversion, repo))
+cat(opt, "\n", file = "~/.Rprofile", append = TRUE)
+shell: Rscript {0}
+
+- name: Install tiledb-r
+run: cd apis/r && Rscript tools/install-tiledb-r.R
+
- name: Dependencies
-run: cd apis/r && tools/r-ci.sh install_all
+run: cd apis/r && Rscript -e "remotes::install_deps(dependencies = TRUE, upgrade = FALSE)"

- name: CMake
uses: lukka/get-cmake@latest
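The "Set additional repositories" step above prepends an r-universe binary repository whose URL embeds the Ubuntu codename and the running R version's major.minor. A sketch of that URL construction (Python, for illustration only; the codename and version values are examples):

```python
def runiverse_binary_repo(base, codename, r_version):
    """Build the r-universe Linux binary repo URL, mirroring the
    workflow's sprintf('%s/bin/linux/%s/%s', ...) pattern."""
    # Keep only major.minor, as the R step does with
    # strsplit(getRversion(), '.')[1:2].
    mm = ".".join(r_version.split(".")[:2])
    return f"{base}/bin/linux/{codename}/{mm}"

url = runiverse_binary_repo("https://tiledb-inc.r-universe.dev", "jammy", "4.3.2")
# → "https://tiledb-inc.r-universe.dev/bin/linux/jammy/4.3"
```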
2 changes: 1 addition & 1 deletion apis/python/setup.py
@@ -331,7 +331,7 @@ def run(self):
"scipy",
# Note: the somacore version is in .pre-commit-config.yaml too
"somacore==1.0.11",
"tiledb~=0.28.0",
"tiledb~=0.29.0",
"typing-extensions", # Note "-" even though `import typing_extensions`
],
extras_require={
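The pin bump `tiledb~=0.28.0` → `tiledb~=0.29.0` uses PEP 440's compatible-release operator: `~=0.29.0` accepts any `0.29.x` at or above `0.29.0`, but not `0.30.0`. A rough sketch of that check, assuming simple numeric versions (real resolution should use `packaging.specifiers`):

```python
def satisfies_compatible_release(version, spec):
    """Approximate PEP 440 '~=': '~=0.29.0' means >=0.29.0, <0.30."""
    base = [int(p) for p in spec.split(".")]
    v = [int(p) for p in version.split(".")]
    # Upper bound: drop the last component and bump the one before it,
    # e.g. [0, 29, 0] -> [0, 30].
    upper = base[:-2] + [base[-2] + 1]
    return v >= base and v[: len(upper)] < upper

assert satisfies_compatible_release("0.29.1", "0.29.0")
assert not satisfies_compatible_release("0.30.0", "0.29.0")
```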
3 changes: 2 additions & 1 deletion apis/python/src/tiledbsoma/__init__.py
@@ -165,7 +165,7 @@
)
from ._indexer import IntIndexer, tiledbsoma_build_index
from ._measurement import Measurement
-from ._sparse_nd_array import SparseNDArray
+from ._sparse_nd_array import SparseNDArray, SparseNDArrayRead
from .options import SOMATileDBContext, TileDBCreateOptions
from .pytiledbsoma import (
tiledbsoma_stats_disable,
@@ -205,6 +205,7 @@
"SOMAError",
"SOMATileDBContext",
"SparseNDArray",
"SparseNDArrayRead",
"TileDBCreateOptions",
"tiledbsoma_build_index",
"tiledbsoma_stats_disable",
14 changes: 9 additions & 5 deletions apis/python/src/tiledbsoma/_dense_nd_array.py
@@ -265,21 +265,25 @@ def write(

clib_dense_array = self._handle._handle

+# Compute the coordinates for the dense array.
new_coords: List[Union[int, Slice[int], None]] = []
for c in coords:
if isinstance(c, slice) and isinstance(c.stop, int):
new_coords.append(slice(c.start, c.stop - 1, c.step))
else:
new_coords.append(c)

+# Convert data to a numpy array.
dtype = self.schema.field("soma_data").type.to_pandas_dtype()
input = np.array(values, dtype=dtype)

-order = (
-clib.ResultOrder.colmajor
-if input.flags.f_contiguous
-else clib.ResultOrder.rowmajor
-)
+# Set the result order. If the data is neither C- nor F-contiguous,
+# copy it to row-major first.
+if input.flags.f_contiguous:
+order = clib.ResultOrder.colmajor
+else:
+if not input.flags.contiguous:
+input = np.ascontiguousarray(input)
+order = clib.ResultOrder.rowmajor
clib_dense_array.reset(result_order=order)

self._set_reader_coords(clib_dense_array, new_coords)
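The coordinate loop above converts Python's half-open slices into the inclusive form used for TileDB coordinates, by decrementing an integer `stop`. A self-contained sketch of that conversion:

```python
def to_inclusive(c):
    """Convert a half-open Python slice to an inclusive one:
    slice(0, 4) (rows 0..3) becomes slice(0, 3). Non-slice
    coordinates, and slices without an integer stop, pass through."""
    if isinstance(c, slice) and isinstance(c.stop, int):
        return slice(c.start, c.stop - 1, c.step)
    return c

assert to_inclusive(slice(0, 4)) == slice(0, 3)
assert to_inclusive(7) == 7
```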
3 changes: 2 additions & 1 deletion apis/python/src/tiledbsoma/_sparse_nd_array.py
@@ -506,7 +506,8 @@ def __init__(


class SparseNDArrayRead(_SparseNDArrayReadBase):
"""Intermediate type to choose result format when reading a sparse array.
""":class:`SparseNDArrayRead` is an intermediate type which supports multiple eventual result formats
when reading a sparse array.
Results returned by `coos` and `tables` iterate over COO coordinates in the user-specified result order,
but with breaks between iterator steps at arbitrary coordinates (i.e., any given result may split a row or
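The docstring notes that `coos`/`tables` iterators may break between steps at arbitrary coordinates, so a single step can split a row. A small numpy illustration (hypothetical data, not the library API) of why per-chunk results are partial but concatenation recovers the full result:

```python
import numpy as np

# Row-major COO triples for a 2x4 matrix with four nonzeros.
rows = np.array([0, 0, 1, 1])
cols = np.array([1, 3, 0, 2])
vals = np.array([10, 20, 30, 40])

# An iterator step may end at an arbitrary coordinate -- here the
# first chunk ends mid-row-1 -- so each chunk alone is incomplete...
chunks = [(rows[:3], cols[:3], vals[:3]), (rows[3:], cols[3:], vals[3:])]

# ...but concatenating all chunks recovers the complete result.
full_rows, full_cols, full_vals = (
    np.concatenate(parts) for parts in zip(*chunks)
)
assert np.array_equal(full_vals, vals)
```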
31 changes: 21 additions & 10 deletions apis/python/src/tiledbsoma/io/ingest.py
@@ -261,22 +261,27 @@ def from_h5ad(
context: Optional :class:`SOMATileDBContext` containing storage parameters, etc.
platform_config: Platform-specific options used to create this array, provided in the form
``{"tiledb": {"create": {"sparse_nd_array_dim_zstd_level": 7}}}`` nested keys.
``{\"tiledb\": {\"create\": {\"sparse_nd_array_dim_zstd_level\": 7}}}``.
obs_id_name and var_id_name: Which AnnData ``obs`` and ``var`` columns, respectively, to use
for append mode: values of this column will be used to decide which obs/var rows in appended
inputs are distinct from the ones already stored, for the assignment of ``soma_joinid``. If
this column exists in the input data, as a named index or a non-index column name, it will
be used. If this column doesn't exist in the input data, and if the index is nameless or
named ``index``, that index will be given this name when written to the SOMA experiment's
``obs`` / ``var``. NOTE: it is not necessary for this column to be the index-column
name in the input AnnData objects ``obs``/``var``.
obs_id_name/var_id_name: Which AnnData ``obs`` and ``var`` columns, respectively, to use
for append mode.
Values of this column will be used to decide which obs/var rows in appended
inputs are distinct from the ones already stored, for the assignment of ``soma_joinid``. If
this column exists in the input data, as a named index or a non-index column name, it will
be used. If this column doesn't exist in the input data, and if the index is nameless or
named ``index``, that index will be given this name when written to the SOMA experiment's
``obs`` / ``var``.
NOTE: it is not necessary for this column to be the index-column
name in the input AnnData objects ``obs``/``var``.
X_layer_name: SOMA array name for the AnnData's ``X`` matrix.
raw_X_layer_name: SOMA array name for the AnnData's ``raw/X`` matrix.
ingest_mode: The ingestion type to perform:
- ``write``: Writes all data, creating new layers if the SOMA already exists.
- ``resume``: Adds data to an existing SOMA, skipping writing data
that was previously written. Useful for continuing after a partial
@@ -286,12 +291,14 @@ def from_h5ad(
multiple H5AD files to a single SOMA.
X_kind: Which type of matrix is used to store dense X data from the
H5AD file: ``DenseNDArray`` or ``SparseNDArray``.
H5AD file: ``DenseNDArray`` or ``SparseNDArray``.
registration_mapping: Does not need to be supplied when ingesting a single
H5AD/AnnData object into a single :class:`Experiment`. When multiple inputs
are to be ingested into a single experiment, there are two steps. First:
+.. code-block:: python
import tiledbsoma.io
rd = tiledbsoma.io.register_h5ads(
experiment_uri,
@@ -305,6 +312,8 @@ def from_h5ad(
Once that's been done, the data ingests per se may be done in any order,
or in parallel, via for each ``h5ad_file_name``:
+.. code-block:: python
tiledbsoma.io.from_h5ad(
experiment_uri,
h5ad_file_name,
@@ -321,6 +330,8 @@ def from_h5ad(
This is a coarse-grained mechanism for setting key-value pairs on all SOMA objects in an
``Experiment`` hierarchy. Metadata for particular objects is more commonly set like:
+.. code-block:: python
with soma.open(uri, 'w') as exp:
exp.metadata.update({"aaa": "BBB"})
exp.obs.metadata.update({"ccc": 123})
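The `obs_id_name`/`var_id_name` description above says appended rows are matched against what is already stored when assigning `soma_joinid`. A deliberately simplified model of that bookkeeping — the function and data here are hypothetical, not the library's actual implementation:

```python
def register_ids(existing, appended_ids):
    """Assign soma_joinid values in append mode: ids already stored
    keep their joinid; previously unseen ids get the next unused ones."""
    mapping = dict(existing)
    next_joinid = max(mapping.values(), default=-1) + 1
    for obs_id in appended_ids:
        if obs_id not in mapping:
            mapping[obs_id] = next_joinid
            next_joinid += 1
    return mapping

# "AAAG" is already registered, so only "TTTG" gets a new joinid.
assert register_ids({"AAAC": 0, "AAAG": 1}, ["AAAG", "TTTG"]) == {
    "AAAC": 0, "AAAG": 1, "TTTG": 2
}
```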
35 changes: 35 additions & 0 deletions apis/python/tests/test_regression.py
@@ -0,0 +1,35 @@
"""Testing module for regression tests"""


import numpy as np
import pyarrow as pa

import tiledbsoma as soma


def test_nd_dense_non_contiguous_write(tmp_path):
"""Test regression dected in GitHub Issue #2537"""
# Create data.
data = (
np.arange(np.product(24), dtype=np.uint8)
.reshape((4, 3, 2))
.transpose((2, 0, 1))
)
coords = tuple(slice(0, dim_len) for dim_len in data.shape)
tensor = pa.Tensor.from_numpy(data)

# Create array and write data to it.
with soma.DenseNDArray.create(
tmp_path.as_posix(), type=pa.uint8(), shape=data.shape
) as array:
array.write(coords, tensor)

# Check the data is correct when we read it back.
with soma.DenseNDArray.open(tmp_path.as_posix()) as array:
result = array.read(coords)
np.testing.assert_equal(data, result.to_numpy())

# Check the data is correct when we read it back in column-major order.
with soma.DenseNDArray.open(tmp_path.as_posix()) as array:
result = array.read(coords, result_order="column-major")
np.testing.assert_equal(data.transpose(), result.to_numpy())
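The regression test above hinges on the input being a transposed view, which is typically neither C- nor F-contiguous — exactly the case the `write()` change handles with `np.ascontiguousarray`. A standalone check of that property:

```python
import numpy as np

# A transposed view of a C-contiguous array is usually neither
# C- nor F-contiguous (its strides match no contiguous layout).
data = np.arange(24, dtype=np.uint8).reshape((4, 3, 2)).transpose((2, 0, 1))
assert not data.flags.c_contiguous
assert not data.flags.f_contiguous

# Copying to row-major changes only the memory layout, not the values,
# which is what a writer handing a raw buffer to C++ needs.
fixed = np.ascontiguousarray(data)
assert fixed.flags.c_contiguous
assert np.array_equal(fixed, data)
```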
2 changes: 1 addition & 1 deletion apis/r/DESCRIPTION
@@ -34,7 +34,7 @@ Imports:
Matrix,
stats,
bit64,
-tiledb (>= 0.26.0), tiledb (<= 0.26.99),
+tiledb (>= 0.27.0), tiledb (<= 0.27.99),
arrow,
utils,
fs,
6 changes: 3 additions & 3 deletions apis/r/tools/get_tarball.R
@@ -14,14 +14,14 @@ isLinux <- Sys.info()["sysname"] == "Linux"
if (isMac) {
arch <- system('uname -m', intern = TRUE)
if (arch == "x86_64") {
url <- "https://github.com/TileDB-Inc/TileDB/releases/download/2.22.0/tiledb-macos-x86_64-2.22.0-52e981e.tar.gz"
url <- "https://github.com/TileDB-Inc/TileDB/releases/download/2.23.0/tiledb-macos-x86_64-2.23.0-152093b.tar.gz"
} else if (arch == "arm64") {
url <- "https://github.com/TileDB-Inc/TileDB/releases/download/2.22.0/tiledb-macos-arm64-2.22.0-52e981e.tar.gz"
url <- "https://github.com/TileDB-Inc/TileDB/releases/download/2.23.0/tiledb-macos-arm64-2.23.0-152093b.tar.gz"
} else {
stop("Unsupported Mac architecture. Please have TileDB Core installed locally.")
}
} else if (isLinux) {
url <- "https://github.com/TileDB-Inc/TileDB/releases/download/2.22.0/tiledb-linux-x86_64-2.22.0-52e981e.tar.gz"
url <- "https://github.com/TileDB-Inc/TileDB/releases/download/2.23.0/tiledb-linux-x86_64-2.23.0-152093b.tar.gz"
} else {
stop("Unsupported platform for downloading artifacts. Please have TileDB Core installed locally.")
}
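`get_tarball.R` picks the TileDB Core artifact by OS and CPU architecture. The same selection logic, sketched in Python (the URL pieces come from the diff above; the function itself is illustrative):

```python
def tiledb_tarball_url(sysname, machine, version="2.23.0", build="152093b"):
    """Choose the TileDB Core release tarball for a platform, mirroring
    apis/r/tools/get_tarball.R."""
    base = "https://github.com/TileDB-Inc/TileDB/releases/download"
    if sysname == "Darwin" and machine in ("x86_64", "arm64"):
        osname = "macos"
    elif sysname == "Linux" and machine == "x86_64":
        osname = "linux"
    else:
        raise RuntimeError(
            "Unsupported platform; please have TileDB Core installed locally."
        )
    return f"{base}/{version}/tiledb-{osname}-{machine}-{version}-{build}.tar.gz"

assert tiledb_tarball_url("Linux", "x86_64").endswith(
    "2.23.0/tiledb-linux-x86_64-2.23.0-152093b.tar.gz"
)
```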
9 changes: 9 additions & 0 deletions apis/r/tools/install-tiledb-r.R
@@ -42,4 +42,13 @@ ctb <- contrib.url(getOption("repos"))
names(ctb) <- getOption("repos")
dbr <- db[idx[valid], 'Repository']
(repos <- names(ctb[ctb %in% dbr]))

+# BSPM doesn't respect `repos`.
+# Check to see if any repo with a valid tiledb-r is CRAN;
+# if not, turn off BSPM.
+cran <- getOption("repos")['CRAN']
+cran[is.na(cran)] <- ""
+if (requireNamespace("bspm", quietly = TRUE) && !any(dbr %in% cran)) {
+bspm::disable()
+}
utils::install.packages("tiledb", repos = repos)
30 changes: 19 additions & 11 deletions doc/README.md
@@ -1,19 +1,27 @@
# Basic on-laptop setup

-```
-pip install .
+Build the docs with:
+```bash
+./local-build.sh
```

-This is very important -- _for developers_, the nominal use-mode is `python setup.py develop` as documented in our [../apis/python/README.md](../apis/python/README.md). But this local build _will not find_ `tiledbsoma-py` from this local install. You must `pip install apis/python` so it can find Python source for document autogen, _and_ you must re-run `pip install apis/python` after each and every source-file edit, even if you're just doing an edit-build-preview iteration loop in a sandbox checkout.
+The first time you run this, it will:
+1. Create and activate a virtualenv (`venv/`)
+2. Install [`requirements_doc.txt`](requirements_doc.txt)
+3. Install `../apis/python` (editable)
+4. Build the docs (output to `doc/html/`)

-```
-#!/bin/bash
-set -euo pipefail
-sphinx-build -E -T -b html -d foo/doctrees -D language=en doc/source doc/html
-```
+Subsequent runs will only perform the 4th step (unless `-r`/`--reinstall` is passed).

Once the docs are built, you can:

+```bash
+open source/_build/html/index.html
+```
-```
-#!/bin/bash
-set -euo pipefail
-open doc/html/python-api.html
-```
+or e.g.:
+```bash
+http-server source/_build/html &
+open http://localhost:8080/
+```

and inspect them.
