Add user guide #690

Merged
merged 31 commits into from
Jun 25, 2024
Changes from 16 commits
ee004a4
user guide
Apr 30, 2024
fd8e52a
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Apr 30, 2024
a88e2ba
module and toctree
Apr 30, 2024
a262829
Merge branch 'add/user_guide' of https://github.com/theislab/moscot i…
Apr 30, 2024
7939851
Merge branch 'main' into add/user_guide
ArinaDanilina Apr 30, 2024
e755c47
Merge branch 'main' into add/user_guide
giovp Apr 30, 2024
82a2bb2
:mod: and header anchor
May 5, 2024
7e78934
module::
May 5, 2024
a9c2020
typos and links
May 7, 2024
fdbdf2e
Merge branch 'main' into add/user_guide
ArinaDanilina May 13, 2024
4f1c8db
tables
May 13, 2024
1d9045e
hyperparameters
May 13, 2024
361324a
typo
May 13, 2024
4c2ca76
OT link
May 13, 2024
684243f
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 13, 2024
ac60f7d
Merge branch 'main' into add/user_guide
ArinaDanilina May 13, 2024
1f95b04
Merge branch 'main' into add/user_guide
ArinaDanilina May 16, 2024
6b4914c
Merge branch 'main' into add/user_guide
ArinaDanilina May 27, 2024
7242750
edits and links
May 27, 2024
9970295
general reference to examples / tutorials
May 27, 2024
8ab44e4
Merge branch 'main' into add/user_guide
ArinaDanilina May 28, 2024
ee4075d
edits
Jun 6, 2024
c0c2397
class link and line breaks
Jun 6, 2024
a98b0b8
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jun 6, 2024
21e8930
linebreaks
Jun 6, 2024
4be5839
Merge branch 'main' into add/user_guide
ArinaDanilina Jun 6, 2024
d280ad8
linebreaks
Jun 6, 2024
7ac2d9c
Merge branch 'main' into add/user_guide
ArinaDanilina Jun 13, 2024
60188f2
remove link
Jun 21, 2024
19c31a8
Merge branch 'main' into add/user_guide
ArinaDanilina Jun 21, 2024
ba6c294
edits
Jun 24, 2024
2 changes: 1 addition & 1 deletion docs/conf.py
@@ -91,7 +91,7 @@
"dollarmath",
"amsmath",
]
-myst_heading_anchors = 2
+myst_heading_anchors = 3


# autodoc + napoleon
1 change: 1 addition & 0 deletions docs/index.rst
@@ -61,6 +61,7 @@ well as the publication introducing the model, which can be found in the corresp
:hidden:

installation
user_guide
user
developer
contributing
108 changes: 108 additions & 0 deletions docs/user_guide.md
@@ -0,0 +1,108 @@
# User guide

moscot is a toolbox that solves a wide range of tasks in single-cell genomics by building on the concept of [Optimal Transport (OT)](<https://en.wikipedia.org/wiki/Transportation_theory_(mathematics)>).

moscot builds upon three principles:

- moscot applications are scalable. While traditional OT implementations are computationally expensive, moscot implements a wide range of solvers that can handle hundreds of thousands of cells.
- moscot supports OT applications across [multiple modalities](#multimodality).
- moscot offers a unified [user interface](#user-interface) and provides flexible implementations.

## Problems

### Biological problems

#### Temporal data

```{eval-rst}
.. list-table::
    :widths: 15 100
    :header-rows: 1

    * - Problem
      - Description
    * - :mod:`moscot.problems.time.TemporalProblem`
      - Class for analyzing time-series single-cell data, based on :cite:`schiebinger:19`.
    * - :mod:`moscot.problems.time.LineageProblem`
      - Estimator for modelling time-series single-cell data, based on :cite:`lange-moslin:23`.
```

#### Spatial data

```{eval-rst}
.. list-table::
    :widths: 15 100
    :header-rows: 1

    * - Problem
      - Description
    * - :mod:`moscot.problems.space.AlignmentProblem`
      - Class for aligning spatial omics data, based on :cite:`zeira:22`.
    * - :mod:`moscot.problems.space.MappingProblem`
      - Class for mapping single-cell omics data onto spatial data, based on :cite:`nitzan:19`.
```

#### Spatiotemporal data

```{eval-rst}
.. list-table::
    :widths: 15 100
    :header-rows: 1

    * - Problem
      - Description
    * - :mod:`moscot.problems.spatiotemporal.SpatioTemporalProblem`
      - Class for analyzing time-series spatial single-cell data.
```

#### Multimodal data

```{eval-rst}
.. list-table::
    :widths: 15 100
    :header-rows: 1

    * - Problem
      - Description
    * - :mod:`moscot.problems.cross_modality.TranslationProblem`
      - Class for integrating single-cell multi-omics data, based on :cite:`demetci-scot:22`.
```

### Generic problems

```{eval-rst}
.. list-table::
    :widths: 15 100
    :header-rows: 1

    * - Problem
      - Description
    * - :mod:`moscot.problems.generic.SinkhornProblem`
      - Class for solving a :term:`linear problem`.
    * - :mod:`moscot.problems.generic.GWProblem`
      - Class for solving a :term:`Gromov-Wasserstein` problem.
    * - :mod:`moscot.problems.generic.FGWProblem`
      - Class for solving a :term:`fused Gromov-Wasserstein` problem.
```
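
For orientation, the three generic problems can be read as the following objectives. This is a textbook formulation with a squared loss, written down under assumptions: the symbols $C$ (cost between source and target cells), $C^X$ and $C^Y$ (costs within the source and within the target), $P$ (coupling), $a$ and $b$ (marginals) and $H$ (entropy) are introduced here for illustration only and may differ from moscot's internal definitions.

$$
\begin{aligned}
\text{linear:}\quad & \min_{P \in U(a, b)} \langle C, P \rangle - \varepsilon H(P) \\
\text{Gromov-Wasserstein:}\quad & \min_{P \in U(a, b)} \sum_{i,j,k,l} \left( C^X_{ik} - C^Y_{jl} \right)^2 P_{ij} P_{kl} - \varepsilon H(P) \\
\text{fused Gromov-Wasserstein:}\quad & \min_{P \in U(a, b)} (1 - \alpha) \langle C, P \rangle + \alpha \sum_{i,j,k,l} \left( C^X_{ik} - C^Y_{jl} \right)^2 P_{ij} P_{kl} - \varepsilon H(P)
\end{aligned}
$$

Here $U(a, b) = \{ P \geq 0 : P \mathbf{1} = a,\ P^\top \mathbf{1} = b \}$ is the set of balanced couplings and $H(P) = -\sum_{i,j} P_{ij} (\log P_{ij} - 1)$. In this notation, $\alpha = 1$ leaves only the quadratic term and $\alpha \to 0$ leaves only the linear term.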

## Scalability

In their original formulation, OT algorithms don't scale to large datasets due to their high computational complexity. moscot overcomes this limitation by allowing the use of low-rank solvers. Each `solve` method has a `rank` parameter, which defaults to $-1$ (full rank). Whenever possible, it's best to start with the full-rank solver; if that is not feasible, set `rank` to a positive integer. The higher the rank, the better the approximation of the full-rank solution. Hence, start with a reasonably high rank, e.g. $5000$, and decrease it step by step if memory constraints require it. Note that the scale of $\tau_a$ and $\tau_b$ changes in the low-rank setting: while they should still lie between $0$ and $1$, empirically they should be set in the range between $0.1$ and $0.5$. See [below](#hyperparameters) for a more detailed discussion, and {doc}`/notebooks/examples/solvers/100_linear_problems_basic` and {doc}`/notebooks/examples/solvers/300_quad_problems_basic` for how to use low-rank solutions.
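
To get a feel for why lowering the rank helps with memory, the sketch below compares the storage needed for a dense coupling with that of a low-rank factorization of the form $Q \operatorname{diag}(1/g) R^\top$, and walks the rank down from $5000$ until a budget is met. It is a rough back-of-the-envelope helper, not part of moscot: the function names, the halving schedule and the memory budget are assumptions made for illustration, and only the coupling itself is counted, not cost matrices or solver workspace.

```python
def coupling_floats(n_source, n_target, rank=-1):
    """Number of floats needed to store the coupling.

    rank == -1: dense (n_source x n_target) matrix.
    rank > 0:   factors Q (n_source x rank), R (n_target x rank), weights g (rank).
    """
    if rank == -1:
        return n_source * n_target
    return rank * (n_source + n_target + 1)


def pick_rank(n_source, n_target, budget_floats, start_rank=5000, min_rank=10):
    """Hypothetical schedule: full rank if it fits, else halve the rank until it fits."""
    if coupling_floats(n_source, n_target) <= budget_floats:
        return -1  # the full-rank coupling fits, so no approximation is needed
    rank = start_rank
    while rank > min_rank and coupling_floats(n_source, n_target, rank) > budget_floats:
        rank //= 2
    return max(rank, min_rank)


if __name__ == "__main__":
    n = 200_000  # cells in source and target
    budget = 2_000_000_000  # assumed budget, about 8 GB of float32 values
    print("dense floats:", coupling_floats(n, n))
    print("suggested rank:", pick_rank(n, n, budget))
```

The value returned by such a helper would then be passed as `rank` to `solve`. Whether the resulting approximation is good enough still has to be checked on the data.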

## Multimodality

All moscot problems are in general applicable to any modality, as the solution of a moscot problem depends only on pairwise distances between cells. Yet, it is up to the user to apply the preprocessing. We recommend using embeddings of dimension $10$ to $100$, e.g. embeddings based on [scVI-tools](https://docs.scvi-tools.org/en/stable/index.html) or linear embeddings. To see how to pass embeddings, have a look at {doc}`/notebooks/tutorials/600_tutorial_translation`.
When working with multiple modalities, we can construct a joint space, e.g. by using VAEs that incorporate multiple modalities ([MultiVI](https://docs.scvi-tools.org/en/stable/user_guide/models/multivi.html)), or by concatenating linear embeddings (e.g. concatenating the PCA and LSI spaces of GEX and ATAC, respectively).
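
The concatenation approach can be sketched with toy data. The example below uses plain Python lists and made-up numbers in place of real PCA and LSI embeddings; the helper names are hypothetical and not part of moscot. It also shows why concatenation is a natural choice when only pairwise distances matter: the squared Euclidean distance in the joint space is the sum of the squared distances in the individual spaces.

```python
import itertools


def concat_embeddings(*embeddings):
    """Concatenate per-cell embeddings (one row per cell) along the feature axis."""
    n_cells = len(embeddings[0])
    if any(len(emb) != n_cells for emb in embeddings):
        raise ValueError("all embeddings must describe the same cells")
    return [
        list(itertools.chain.from_iterable(emb[i] for emb in embeddings))
        for i in range(n_cells)
    ]


def sq_euclidean(x, y):
    """Squared Euclidean distance between two feature vectors."""
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))


# Toy stand-ins for a PCA embedding of GEX and an LSI embedding of ATAC (3 cells).
pca_gex = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
lsi_atac = [[0.5], [1.5], [0.0]]

joint = concat_embeddings(pca_gex, lsi_atac)

# Distance between cell 0 and cell 1 in each space and in the joint space.
d_gex = sq_euclidean(pca_gex[0], pca_gex[1])
d_atac = sq_euclidean(lsi_atac[0], lsi_atac[1])
d_joint = sq_euclidean(joint[0], joint[1])
```

With real data one would typically concatenate arrays with `numpy.concatenate(..., axis=1)`. Because the distances add up, a modality whose embedding has a larger scale dominates the joint cost, so rescaling the blocks before concatenation may be worth considering.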

## User interface

moscot problems implement problem-specific downstream methods, so we recommend using task-specific moscot problems. For more advanced users, we also offer [generic solvers](#generic-problems), which allow for more flexibility but come with a limited range of downstream applications.

## Hyperparameters

moscot problems' `solve` methods have the following parameters that can be set depending on the specific task:

- $\alpha$ - Parameter in $(0, 1]$ that interpolates between the {term}`quadratic term` and the {term}`linear term`. $\alpha = 1$ corresponds to the pure {term}`Gromov-Wasserstein` problem while $\alpha \to 0$ corresponds to the pure {term}`linear problem`.
- $\tau_a$ and $\tau_b$ - Parameters in $(0, 1]$ that define how {term}`unbalanced <unbalanced OT problem>` the problem is on the source and target {term}`marginals`. If $1$, the problem is {term}`balanced <balanced OT problem>`.
- $\varepsilon$ - {term}`Entropic regularization`.
- `rank` - Rank of the {term}`low-rank OT` solver {cite}`scetbon:21b`. If $-1$, the full-rank solver {cite}`peyre:2016` is used.
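
To build intuition for $\varepsilon$, $\tau_a$ and $\tau_b$, here is a minimal pure-Python sketch of the textbook Sinkhorn scaling iterations for the linear problem. It is not moscot's solver, and the function names are made up for illustration. It assumes the common convention in which the unbalanced updates raise the scaling step to the powers $\tau_a$ and $\tau_b$, so that $\tau = 1$ gives the balanced iterations.

```python
import math


def sinkhorn(cost, a, b, epsilon, tau_a=1.0, tau_b=1.0, n_iter=500):
    """Entropic OT by scaling iterations; returns the coupling as nested lists."""
    n, m = len(a), len(b)
    kernel = [[math.exp(-cost[i][j] / epsilon) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iter):
        # Exponents below 1 relax the marginal constraints; 1 enforces them.
        u = [
            (a[i] / sum(kernel[i][j] * v[j] for j in range(m))) ** tau_a
            for i in range(n)
        ]
        v = [
            (b[j] / sum(kernel[i][j] * u[i] for i in range(n))) ** tau_b
            for j in range(m)
        ]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


def entropy(coupling):
    """Shannon entropy of the coupling; higher means a more diffuse plan."""
    return -sum(p * math.log(p) for row in coupling for p in row if p > 0)


cost = [[0.0, 1.0], [1.0, 0.0]]
a = [0.5, 0.5]
b = [0.5, 0.5]

diffuse = sinkhorn(cost, a, b, epsilon=1.0)  # strong regularization
sharp = sinkhorn(cost, a, b, epsilon=0.05)  # weak regularization
relaxed = sinkhorn(cost, a, b, epsilon=1.0, tau_a=0.5, tau_b=0.5)  # unbalanced
```

In this toy problem, the balanced couplings reproduce the marginals `a` and `b`, the coupling with the larger $\varepsilon$ spreads mass more evenly across cells, and the unbalanced coupling no longer matches the prescribed marginals.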