Multi table (#410)
* [pre-commit.ci] pre-commit autoupdate (#394)

* [pre-commit.ci] pre-commit autoupdate

updates:
- [github.com/psf/black: 23.10.1 → 23.11.0](psf/black@23.10.1...23.11.0)
- [github.com/pre-commit/mirrors-prettier: v3.0.3 → v3.1.0](pre-commit/mirrors-prettier@v3.0.3...v3.1.0)
- [github.com/pre-commit/mirrors-mypy: v1.6.1 → v1.7.0](pre-commit/mirrors-mypy@v1.6.1...v1.7.0)
- [github.com/astral-sh/ruff-pre-commit: v0.1.3 → v0.1.6](astral-sh/ruff-pre-commit@v0.1.3...v0.1.6)

* fix pre-commit

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: giovp <[email protected]>

* initial tests multi_table design

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add mock new table

* create test class and cleanup

* additional cleanup

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* additional cleanup

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add pseudo methods

* Change table type in init

* make tables plural and add to validation in __init__

* revert to old public accessor

* Validate each table in dictionary

* iterate dict values

* add comment

* adjust table getter

* Add tables getter

* Fix missing parenthesis

* change to warnings.warn DeprecationWarning

* allow for backward compatibility in init

* [pre-commit.ci] pre-commit autoupdate (#408)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* fix dict subscriptable

* fix string representation of sdata

* add deprecation decorator for future

* Allow for tables not annotating elements

* switch to using tables with deprecation

* fix string representation

* write tables element group

* adjust io to multi_table

* Alter io to give None as default value for spatialdata attrs keys

* add tables setter

* raise keyerror table getter

* remove commented tables setter

* raise keyerror in table deleter

* add deprecation warning

* fix tests

* add DeprecationWarning

* comment test

* change setter into method

* circumvent mappingproxy set issue

* adjust set get test

* add get table keys

* add column getters

* add change set target table

* Give default table name

* fix spatialdata without table

* add int32 because of windows and add docstring

* fix filtering by coordinate system

* Change to Path to not be linux / mac specific

* Change to Path to not be linux / mac specific

* table should annotate existing element

* return table with AnnData having 0 rows

* Adjust for windows

* adjust for accessing table elements

* fix change annotation target

* fix set annotation target

* fix/add tests

* fix init from elements

* fix init from elements tests

* add validation check

* add table validation SpatialData.__init

* fix ruff

* only concatenate if annotating

* change into warning because of filtering

* fix last tests

* adjust to tables

* use tables parameter

* fix some mypy

* some mypy fixes

* some more mypy

* fix another mypy

* circumvent typing error on py3.9

* mypy yet again

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix pre_commit

* down to 12 mypy errors

* down to 1 mypy error

* fixed mypy errors

* fix set_table_annotation

* added docstring

* refactor data loader (#299)

Co-authored-by: LucaMarconato <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Luca Marconato <[email protected]>

* add documentation

* add documentation

* minor adjustment docstring

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Added / adjusted docstrings

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix mypy after merge

* refactor function name

This avoids confusion: letting this function generate table values causes many errors that are not easily resolved. The new name makes it clear that
only spatial element values are generated, not tables. This is in contrast to gen_elements, which returns tables as well.

* [pre-commit.ci] pre-commit autoupdate (#411)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* small fixes

* added gen_elements docstrings

* tiny comments

* fix ruff pre-commit

* removed types from docstring

* refactor of set_table_annotation_target

* add quotes

* fix (?)

* refactor error messages

* fix incremental update (#329)

Co-authored-by: Wouter-Michiel Vierdag <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* add concatenate argument

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add util functions to init

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add util functions to init

* add tables class

* add table class

* add deprecation back

* rename function in tests

* rename function in tests

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix precommit

* update precommit and remove add_table, store_table and general fixes

* adjust tables init to incremental update pr

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix deletion of deprecated table

* revert filter change

* add public generators

* adjust to public generators

* Find element uses public generator

* add validation in sdata for tables

* add deprecation version number

* fix mypy errors

* Fix backing when deleting table

* Fix mypy

* cleanup

* change target_element_name to region

* refactor test

* adjust concatenate regarding not concatenating tables

* add utility function

* concatenate if in multiple sdata objects

* minor docstring refactor

* fix import

* concatenate with tables

* cleanup

* fix test

* [pre-commit.ci] pre-commit autoupdate (#415)

updates:
- [github.com/psf/black: 23.11.0 → 23.12.1](psf/black@23.11.0...23.12.1)
- [github.com/pre-commit/mirrors-prettier: v4.0.0-alpha.4 → v4.0.0-alpha.8](pre-commit/mirrors-prettier@v4.0.0-alpha.4...v4.0.0-alpha.8)
- [github.com/pre-commit/mirrors-mypy: v1.7.1 → v1.8.0](pre-commit/mirrors-mypy@v1.7.1...v1.8.0)
- [github.com/astral-sh/ruff-pre-commit: v0.1.7 → v0.1.9](astral-sh/ruff-pre-commit@v0.1.7...v0.1.9)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* fixed typings; made pytest raises explicit

* minor fixes human readable strings

* fixed tests

* fix

* Add changes to changelog

* make private

* Remove commented code

* add cache to ignore

* updated changelog with giovp old pr

* refactor into private function

* Fix docstring

* Fix import

* specify key reuse in docstring

* add orphan_table argument

* Change docstring

* remove todo

* add example

* change concatenate logic

* updated changelog

* Allow force-overwriting existing files (non-backing) (#344)

* Add test for writing unbacked data over existing files

* Protect overwriting existing file only if it is backing file

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Simplify assertion, remove try/except

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed pre-commit

* added get_dask_backing_files(); improved sdata.write with overwrite=True

* fix docs

* changed version in changelog

* fix exception string

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Luca Marconato <[email protected]>
Co-authored-by: LucaMarconato <[email protected]>

* Added error message for removed add_elements functions (#420)

* added error message for removed add_elements functions

* moved _error_message_add_element() to _utils

* added validate and set region key

* fix docstring

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Added public functions Spatialdata

* fix tests

* add docstrings

* Added subset API; fix behavior with zero-len table (#426)

* added subset API, returning None instead of empty table for APIs with  filter_table=True

* fix 3.9

* [pre-commit.ci] pre-commit autoupdate (#424)

updates:
- [github.com/astral-sh/ruff-pre-commit: v0.1.9 → v0.1.11](astral-sh/ruff-pre-commit@v0.1.9...v0.1.11)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* remove docstring typehint

* Warn user over overwrite in docstring

* Fix query of 2D/3D data with 2D/3D bounding box (#409)

* wip

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* wip

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix 2d/3d bb for raster data

* support for 2d/3d bb for 2d/3d points

* better tests

* applied suggestions from giovp

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* added comments

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* [pre-commit.ci] pre-commit autoupdate (#430)

updates:
- [github.com/astral-sh/ruff-pre-commit: v0.1.11 → v0.1.13](astral-sh/ruff-pre-commit@v0.1.11...v0.1.13)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* updated changelog

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* refactor subset

* remove todo

* Made _locate_spatial_element public, renamed to locate_element() (#427)

* made _locate_spatial_element public, renamed to locate_element()

* returning path instead of tuple in locate_element()

* updated changelog

* locate_elements() now returns a list

* fix test

* Update test_and_deploy.yaml (#434)

Triggering the tests for pull requests to any branch.

* change docstring

* fix query test

* add todo

* refactor filter_by_coordinate_system

* test filter with keep table

* adjust docstring

* adjust docstring

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: giovp <[email protected]>
Co-authored-by: Giovanni Palla <[email protected]>
Co-authored-by: LucaMarconato <[email protected]>
Co-authored-by: Luca Marconato <[email protected]>
Co-authored-by: aeisenbarth <[email protected]>
7 people authored Jan 17, 2024
1 parent 33d8320 commit 75d66f1
Showing 44 changed files with 2,675 additions and 1,465 deletions.
4 changes: 2 additions & 2 deletions .github/workflows/test_and_deploy.yaml
@@ -6,7 +6,7 @@ on:
tags:
- "v*" # Push events to matching v*, i.e. v1.0, v20.15.10
pull_request:
branches: [main]
branches: "*"

jobs:
test:
@@ -63,7 +63,7 @@ jobs:
PLATFORM: ${{ matrix.os }}
DISPLAY: :42
run: |
pytest -v --cov --color=yes --cov-report=xml
pytest --cov --color=yes --cov-report=xml
- name: Upload coverage to Codecov
uses: codecov/[email protected]
with:
3 changes: 3 additions & 0 deletions .gitignore
@@ -45,3 +45,6 @@ spatialdata-sandbox

# version file
_version.py

# other
node_modules/
8 changes: 4 additions & 4 deletions .pre-commit-config.yaml
@@ -9,25 +9,25 @@ ci:
skip: []
repos:
- repo: https://github.com/psf/black
rev: 23.10.1
rev: 23.12.1
hooks:
- id: black
- repo: https://github.com/pre-commit/mirrors-prettier
rev: v3.0.3
rev: v4.0.0-alpha.8
hooks:
- id: prettier
- repo: https://github.com/asottile/blacken-docs
rev: 1.16.0
hooks:
- id: blacken-docs
- repo: https://github.com/pre-commit/mirrors-mypy
rev: v1.6.1
rev: v1.8.0
hooks:
- id: mypy
additional_dependencies: [numpy, types-requests]
exclude: tests/|docs/
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.1.3
rev: v0.1.13
hooks:
- id: ruff
args: [--fix, --exit-non-zero-on-fix]
2 changes: 1 addition & 1 deletion .readthedocs.yaml
@@ -6,7 +6,7 @@ build:
python: "3.10"
sphinx:
configuration: docs/conf.py
fail_on_warning: false
fail_on_warning: true
python:
install:
- method: pip
50 changes: 49 additions & 1 deletion CHANGELOG.md
@@ -8,10 +8,58 @@ and this project adheres to [Semantic Versioning][].
[keep a changelog]: https://keepachangelog.com/en/1.0.0/
[semantic versioning]: https://semver.org/spec/v2.0.0.html

## [0.0.15] - tbd
## [0.1.0] - tbd

### Added

#### Major

- Implemented support in SpatialData for storing multiple tables. These tables
can annotate a SpatialElement but not necessarily so.
- Increased in-memory vs on-disk control: changes performed in-memory (e.g. adding a new image) are not automatically performed on-disk.

#### Minor

- Added public helper function get_table_keys in spatialdata.models to retrieve annotation information of a given table.
- Added public helper function check_target_region_column_symmetry in spatialdata.models to check whether annotation
metadata in table.uns['spatialdata_attrs'] corresponds with respective columns in table.obs.
- Added function validate_table_in_spatialdata in SpatialData to validate the annotation target of a table being
present in the SpatialData object.
- Added function get_annotated_regions in SpatialData to get the regions annotated by a given table.
- Added function get_region_key_column in SpatialData to get the region_key column in table.obs.
- Added function get_instance_key_column in SpatialData to get the instance_key column in table.obs.
- Added function set_table_annotates_spatialelement in SpatialData to either set or change the annotation metadata of
a table in a given SpatialData object.
- Added tables property in SpatialData.
- Added tables setter in SpatialData.
- Added gen_spatial_elements generator in SpatialData to generate the SpatialElements in a given SpatialData object.
- Added gen_elements generator in SpatialData to generate elements of a SpatialData object including tables.
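
The annotation metadata mentioned in the entries above is stored in table.uns['spatialdata_attrs'] and is expected to agree with the corresponding columns in table.obs. Below is a minimal sketch of that kind of consistency check. It uses plain dictionaries in place of an AnnData table; the helper name `regions_match_metadata` and the sample values are hypothetical and are not part of spatialdata.

```python
# Hypothetical stand-ins (assumptions, not the spatialdata API):
# `attrs` mimics table.uns["spatialdata_attrs"],
# `obs` mimics table.obs as a mapping of column name -> list of values.
def regions_match_metadata(attrs: dict, obs: dict) -> bool:
    """Return True if the regions declared in the metadata equal those found in obs."""
    region = attrs["region"]
    # The declared region may be a single name or a list of names.
    declared = {region} if isinstance(region, str) else set(region)
    found = set(obs[attrs["region_key"]])
    return declared == found


attrs = {"region": ["cells", "nuclei"], "region_key": "region", "instance_key": "instance_id"}
obs = {
    "region": ["cells", "cells", "nuclei"],
    "instance_id": [0, 1, 0],
}

result_ok = regions_match_metadata(attrs, obs)
# Declaring only "cells" while obs also contains "nuclei" is a mismatch.
result_bad = regions_match_metadata({**attrs, "region": "cells"}, obs)
```

The real helper may report a mismatch differently (for example by raising an error); this sketch only illustrates the comparison between declared regions and the values in the region_key column.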

### Changed

#### Minor

- Changed the string representation of SpatialData to reflect the changes in regard to multiple tables.

## [0.0.x] - tbd

### Minor

- improved usability and robustness of sdata.write() when overwrite=True @aeisenbarth

### Added

- added SpatialData.subset() API
- added SpatialData.locate_element() API

### Fixed

- generalized queries to any combination of 2D/3D data and 2D/3D query region #409

#### Minor

- refactored data loader for deep learning

## [0.0.14] - 2023-10-11

### Added
8 changes: 0 additions & 8 deletions docs/_templates/autosummary/class.rst
@@ -12,11 +12,8 @@ Attributes table
~~~~~~~~~~~~~~~~~~

.. autosummary::

{% for item in attributes %}

~{{ fullname }}.{{ item }}

{%- endfor %}
{% endif %}
{% endblock %}
@@ -27,13 +24,10 @@ Methods table
~~~~~~~~~~~~~

.. autosummary::

{% for item in methods %}

{%- if item != '__init__' %}
~{{ fullname }}.{{ item }}
{%- endif -%}

{%- endfor %}
{% endif %}
{% endblock %}
@@ -46,7 +40,6 @@ Attributes
{% for item in attributes %}

.. autoattribute:: {{ [objname, item] | join(".") }}

{%- endfor %}

{% endif %}
@@ -61,7 +54,6 @@ Methods
{%- if item != '__init__' %}

.. automethod:: {{ [objname, item] | join(".") }}

{%- endif -%}
{%- endfor %}

16 changes: 11 additions & 5 deletions docs/api.md
@@ -29,12 +29,12 @@ Operations on `SpatialData` objects.
get_extent
match_table_to_element
concatenate
rasterize
transform
rasterize
aggregate
```

### Utilities
### Operations Utilities

```{eval-rst}
.. autosummary::
@@ -49,6 +49,7 @@ The elements (building-blocks) that consitute `SpatialData`.

```{eval-rst}
.. currentmodule:: spatialdata.models
.. autosummary::
:toctree: generated
@@ -61,9 +62,11 @@ The elements (building-blocks) that consitute `SpatialData`.
TableModel
```

### Utilities
### Models Utilities

```{eval-rst}
.. currentmodule:: spatialdata.models
.. autosummary::
:toctree: generated
@@ -94,9 +97,11 @@ The transformations that can be defined between elements and coordinate systems
Sequence
```

### Utilities
### Transformations Utilities

```{eval-rst}
.. currentmodule:: spatialdata.transformations
.. autosummary::
:toctree: generated
Expand All @@ -119,7 +124,7 @@ The transformations that can be defined between elements and coordinate systems
ImageTilesDataset
```

## Input/output
## Input/Output

```{eval-rst}
.. currentmodule:: spatialdata
Expand All @@ -129,4 +134,5 @@ The transformations that can be defined between elements and coordinate systems
read_zarr
save_transformations
get_dask_backing_files
```
1 change: 0 additions & 1 deletion pyproject.toml
@@ -50,7 +50,6 @@ dev = [
docs = [
"sphinx>=4.5",
"sphinx-book-theme>=1.0.0",
"sphinx_rtd_theme",
"myst-nb",
"sphinxcontrib-bibtex>=1.0.0",
"sphinx-autodoc-typehints",
3 changes: 2 additions & 1 deletion src/spatialdata/__init__.py
@@ -28,6 +28,7 @@
"read_zarr",
"unpad_raster",
"save_transformations",
"get_dask_backing_files",
]

from spatialdata import dataloader, models, transformations
@@ -40,6 +41,6 @@
from spatialdata._core.query.relational_query import get_values, match_table_to_element
from spatialdata._core.query.spatial_query import bounding_box_query, polygon_query
from spatialdata._core.spatialdata import SpatialData
from spatialdata._io._utils import save_transformations
from spatialdata._io._utils import get_dask_backing_files, save_transformations
from spatialdata._io.io_zarr import read_zarr
from spatialdata._utils import unpad_raster
116 changes: 116 additions & 0 deletions src/spatialdata/_core/_elements.py
@@ -0,0 +1,116 @@
"""SpatialData elements."""
from __future__ import annotations

from collections import UserDict
from collections.abc import Iterable
from typing import Any
from warnings import warn

from anndata import AnnData
from dask.dataframe.core import DataFrame as DaskDataFrame
from datatree import DataTree
from geopandas import GeoDataFrame

from spatialdata._types import Raster_T
from spatialdata._utils import multiscale_spatial_image_from_data_tree
from spatialdata.models import (
    Image2DModel,
    Image3DModel,
    Labels2DModel,
    Labels3DModel,
    PointsModel,
    ShapesModel,
    TableModel,
    get_axes_names,
    get_model,
)


class Elements(UserDict[str, Any]):
    def __init__(self, shared_keys: set[str | None]) -> None:
        self._shared_keys = shared_keys
        super().__init__()

    @staticmethod
    def _check_key(key: str, element_keys: Iterable[str], shared_keys: set[str | None]) -> None:
        if key in element_keys:
            warn(f"Key `{key}` already exists. Overwriting it.", UserWarning, stacklevel=2)
        else:
            if key in shared_keys:
                raise KeyError(f"Key `{key}` already exists.")

    def __setitem__(self, key: str, value: Any) -> None:
        self._shared_keys.add(key)
        super().__setitem__(key, value)

    def __delitem__(self, key: str) -> None:
        self._shared_keys.remove(key)
        super().__delitem__(key)


class Images(Elements):
    def __setitem__(self, key: str, value: Raster_T) -> None:
        self._check_key(key, self.keys(), self._shared_keys)
        if isinstance(value, (DataTree)):
            value = multiscale_spatial_image_from_data_tree(value)
        schema = get_model(value)
        if schema not in (Image2DModel, Image3DModel):
            raise TypeError(f"Unknown element type with schema: {schema!r}.")
        ndim = len(get_axes_names(value))
        if ndim == 3:
            Image2DModel().validate(value)
            super().__setitem__(key, value)
        elif ndim == 4:
            Image3DModel().validate(value)
            super().__setitem__(key, value)
        else:
            raise NotImplementedError("TODO: implement for ndim > 4.")


class Labels(Elements):
    def __setitem__(self, key: str, value: Raster_T) -> None:
        self._check_key(key, self.keys(), self._shared_keys)
        if isinstance(value, (DataTree)):
            value = multiscale_spatial_image_from_data_tree(value)
        schema = get_model(value)
        if schema not in (Labels2DModel, Labels3DModel):
            raise TypeError(f"Unknown element type with schema: {schema!r}.")
        ndim = len(get_axes_names(value))
        if ndim == 2:
            Labels2DModel().validate(value)
            super().__setitem__(key, value)
        elif ndim == 3:
            Labels3DModel().validate(value)
            super().__setitem__(key, value)
        else:
            raise NotImplementedError("TODO: implement for ndim > 3.")


class Shapes(Elements):
    def __setitem__(self, key: str, value: GeoDataFrame) -> None:
        self._check_key(key, self.keys(), self._shared_keys)
        schema = get_model(value)
        if schema != ShapesModel:
            raise TypeError(f"Unknown element type with schema: {schema!r}.")
        ShapesModel().validate(value)
        super().__setitem__(key, value)


class Points(Elements):
    def __setitem__(self, key: str, value: DaskDataFrame) -> None:
        self._check_key(key, self.keys(), self._shared_keys)
        schema = get_model(value)
        if schema != PointsModel:
            raise TypeError(f"Unknown element type with schema: {schema!r}.")
        PointsModel().validate(value)
        super().__setitem__(key, value)


class Tables(Elements):
    def __setitem__(self, key: str, value: AnnData) -> None:
        self._check_key(key, self.keys(), self._shared_keys)
        schema = get_model(value)
        if schema != TableModel:
            raise TypeError(f"Unknown element type with schema: {schema!r}.")
        TableModel().validate(value)
        super().__setitem__(key, value)
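
The element containers above share one set of keys, so a name used by one element type cannot be reused by another, while re-assigning a key within the same container only warns. Below is a minimal, dependency-free sketch of that idea. `SharedKeyDict` is a hypothetical simplification written for illustration: it omits the schema validation, and the ordering inside `__delitem__` differs slightly from the diff.

```python
import warnings
from collections import UserDict


class SharedKeyDict(UserDict):
    """Hypothetical simplification of the Elements container shown in the diff."""

    def __init__(self, shared_keys: set) -> None:
        # The same set object is passed to every container that must share names.
        self._shared_keys = shared_keys
        super().__init__()

    def __setitem__(self, key, value) -> None:
        if key in self.data:
            # Same container, same key: overwrite, but tell the user.
            warnings.warn(f"Key `{key}` already exists. Overwriting it.", UserWarning, stacklevel=2)
        elif key in self._shared_keys:
            # The key is owned by a different container.
            raise KeyError(f"Key `{key}` already exists.")
        self._shared_keys.add(key)
        super().__setitem__(key, value)

    def __delitem__(self, key) -> None:
        super().__delitem__(key)
        self._shared_keys.discard(key)


shared = set()
images = SharedKeyDict(shared)
tables = SharedKeyDict(shared)

images["blobs"] = "image data"
try:
    tables["blobs"] = "table data"  # name already taken by `images`
    clash = False
except KeyError:
    clash = True

del images["blobs"]             # frees the name in the shared set
tables["blobs"] = "table data"  # now allowed
```

Passing the same set to each container is what makes the uniqueness check span all element types without the containers knowing about each other.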
22 changes: 22 additions & 0 deletions src/spatialdata/_core/_utils.py
@@ -0,0 +1,22 @@
from spatialdata._core.spatialdata import SpatialData


def _find_common_table_keys(sdatas: list[SpatialData]) -> set[str]:
    """
    Find table keys present in more than one SpatialData object.

    Parameters
    ----------
    sdatas
        A list of SpatialData objects.

    Returns
    -------
    A set of common keys that are present in the tables of more than one SpatialData object.
    """
    common_keys = set(sdatas[0].tables.keys())

    for sdata in sdatas[1:]:
        common_keys.intersection_update(sdata.tables.keys())

    return common_keys
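
The helper above narrows a set of keys by repeated intersection. Below is a small sketch of the same pattern on plain dictionaries, assuming nothing about spatialdata; the name `find_common_keys` and the sample data are hypothetical. As written, an intersection keeps only keys found in every input, which for three or more inputs is stricter than "present in more than one".

```python
def find_common_keys(key_collections):
    """Return the keys present in every collection (hypothetical helper)."""
    if not key_collections:
        return set()
    common = set(key_collections[0])
    for keys in key_collections[1:]:
        # Drop every key that the next collection does not contain.
        common.intersection_update(keys)
    return common


tables_a = {"table": 1, "extra": 2}
tables_b = {"table": 3}
tables_c = {"table": 4, "extra": 5}

common = find_common_keys([tables_a.keys(), tables_b.keys(), tables_c.keys()])
```

Here "extra" appears in two of the three inputs but is not in the result, because `tables_b` lacks it. The guard for an empty input list is an addition in this sketch; the function in the diff indexes `sdatas[0]` directly.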