This repository has been archived by the owner on Apr 19, 2023. It is now read-only.

Merge pull request #194 from vib-singlecell-nf/develop
Develop

Former-commit-id: 17c89c9
dweemx authored Apr 3, 2020
2 parents 66bd074 + 1f41dc1 commit 146cfc3
Showing 17 changed files with 717 additions and 217 deletions.
147 changes: 108 additions & 39 deletions README.rst
@@ -1,7 +1,109 @@
VSN-Pipelines
==============

|Nextflow| |Gitter| |ReadTheDocs| |Zenodo|
A repository of pipelines for single-cell data analysis in Nextflow DSL2.

|VSN-Pipelines| |ReadTheDocs| |Zenodo| |Gitter| |Nextflow|


**Full documentation** is available on `Read the Docs <https://vsn-pipelines.readthedocs.io/en/latest/>`_, or take a look at the `Quick Start <https://vsn-pipelines.readthedocs.io/en/latest/getting-started.html#quick-start>`_ guide.

This main repo contains multiple workflows for analyzing single-cell transcriptomics data and depends on a number of tools, which are organized into submodules within the VIB-Singlecell-NF_ organization.
Currently available workflows are listed below.

Raw Data Processing Workflows
-----------------------------

These are set up to run Cell Ranger and DropSeq pipelines.

.. list-table:: Raw Data Processing Workflows
    :widths: 15 10 30
    :header-rows: 1

    * - Pipeline / Entrypoint
      - Purpose
      - Documentation
    * - cellranger
      - Process 10x Chromium data
      - cellranger_
    * - demuxlet_freemuxlet
      - Demultiplexing
      - demuxlet_freemuxlet_
    * - nemesh
      - Process Drop-seq data
      - nemesh_

.. _cellranger: https://vsn-pipelines.readthedocs.io/en/latest/pipelines.html#cellranger
.. _demuxlet_freemuxlet: https://vsn-pipelines.readthedocs.io/en/develop/pipelines.html#demuxlet-freemuxlet
.. _nemesh: https://vsn-pipelines.readthedocs.io/en/develop/pipelines.html#nemesh


Single Sample Workflows
-----------------------

The **Single Sample Workflows** perform a "best practices" scRNA-seq analysis. Multiple samples can be run in parallel, treating each sample separately.

.. list-table:: Single Sample Workflows
    :header-rows: 1

    * - Pipeline / Entrypoint
      - Purpose
      - Documentation
    * - single_sample
      - Independent samples
      - |single_sample|
    * - single_sample_scenic
      - Ind. samples + SCENIC
      - |single_sample_scenic|
    * - scenic
      - SCENIC GRN inference
      - |scenic|
    * - scenic_multiruns
      - SCENIC run multiple times
      - |scenic_multiruns|
    * - single_sample_scenic_multiruns
      - Ind. samples + multi-SCENIC
      - |single_sample_scenic_multiruns|


Sample Aggregation Workflows
----------------------------

The **Sample Aggregation Workflows** perform a "best practices" scRNA-seq analysis on a merged and batch-corrected group of samples. Available batch correction methods include BBKNN, mnnCorrect, and Harmony.

.. list-table:: Sample Aggregation Pipelines
    :widths: 15 10 30
    :header-rows: 1

    * - Pipeline / Entrypoint
      - Purpose
      - Documentation
    * - bbknn
      - Sample aggregation + BBKNN
      - |bbknn|
    * - bbknn_scenic
      - BBKNN + SCENIC
      - |bbknn_scenic|
    * - harmony
      - Sample aggregation + Harmony
      - |harmony|
    * - mnncorrect
      - Sample aggregation + mnnCorrect
      - |mnncorrect|


In addition, the pySCENIC_ implementation of the SCENIC_ workflow is integrated here and can be run in conjunction with any of the above workflows.
The output of each of the main workflows is a loom_-format file, which is ready for import into the interactive single-cell web visualization tool SCope_.
Data is also output in h5ad format, and reports are generated for the major pipeline steps.

If VSN-Pipelines is useful for your research, consider citing:

- VSN-Pipelines All Versions (latest): `10.5281/zenodo.3703108 <https://doi.org/10.5281/zenodo.3703108>`_.


.. |VSN-Pipelines| image:: https://img.shields.io/github/v/release/vib-singlecell-nf/vsn-pipelines
    :target: https://github.com/vib-singlecell-nf/vsn-pipelines/releases
    :alt: GitHub release (latest by date)

.. |ReadTheDocs| image:: https://readthedocs.org/projects/vsn-pipelines/badge/?version=latest
    :target: https://vsn-pipelines.readthedocs.io/en/latest/?badge=latest
@@ -19,8 +121,11 @@ VSN-Pipelines
    :target: https://zenodo.org/badge/latestdoi/199477571
    :alt: Zenodo


|single_sample| |single_sample_scenic| |scenic| |scenic_multiruns| |single_sample_scenic_multiruns| |bbknn| |bbknn_scenic| |harmony| |mnncorrect|
.. _VIB-Singlecell-NF: https://github.com/vib-singlecell-nf
.. _pySCENIC: https://github.com/aertslab/pySCENIC
.. _SCENIC: https://aertslab.org/#scenic
.. _loom: http://loompy.org/
.. _SCope: http://scope.aertslab.org/

.. |single_sample| image:: https://github.com/vib-singlecell-nf/vsn-pipelines/workflows/single_sample/badge.svg
    :target: https://vsn-pipelines.readthedocs.io/en/latest/pipelines.html#single-sample-single-sample
@@ -58,39 +163,3 @@ VSN-Pipelines
    :target: https://vsn-pipelines.readthedocs.io/en/latest/pipelines.html#mnncorrect-mnncorrect
    :alt: MNN-correct Pipeline

A repository of pipelines for single-cell data in Nextflow DSL2.

A quick tour of the VSN pipelines ? Please read `Quick Start <https://vsn-pipelines.readthedocs.io/en/latest/getting-started.html#quick-start>`_.

Full documentation available on `Read the Docs <https://vsn-pipelines.readthedocs.io/en/latest/>`_

If VSN-Pipelines is useful for your research, consider citing:

- VSN-Pipelines All Versions (latest): `10.5281/zenodo.3703108 <https://doi.org/10.5281/zenodo.3703108>`_.

This main repo contains multiple workflows for analyzing single cell transcriptomics data, and depends on a number of tools, which are organized into submodules within the VIB-Singlecell-NF_ organization.
Currently available workflows include:

.. _VIB-Singlecell-NF: https://github.com/vib-singlecell-nf

- **Cell Ranger**: processes 10x Chromium data to align reads to generate an expression counts matrix.
- **DropSeq**: processes Drop-seq data from read alignment to expression counts.
- **Single sample workflows**: perform a "best practices" scRNA-seq analysis. Multiple samples can be run in parallel, treating each sample separately.
- **Multi-sample workflows**: perform a "best practices" scRNA-seq analysis on a merged and batch-corrected group of samples. Available batch correction methods include:

- **BBKNN**
- **mnnCorrect**
- **Harmony**

* **GRN inference**:

* The pySCENIC_ implementation of the SCENIC_ workflow is integrated here and can be run in conjunction with any of the above workflows.

.. _pySCENIC: https://github.com/aertslab/pySCENIC
.. _SCENIC: https://aertslab.org/#scenic

The output of each of the main workflows is a loom_-format file, which is ready for import into the interactive single-cell web visualization tool SCope_.
In addition, data is also output in h5ad format, and reports are generated for the major pipeline steps.

.. _loom: http://loompy.org/
.. _SCope: http://scope.aertslab.org/
32 changes: 17 additions & 15 deletions conf/generic.config
@@ -1,20 +1,22 @@
params {
// This closure facilitates the usage of sample specific parameters
// This closure facilitates the usage of sample specific parameters
parseConfig = { sample, paramsGlobal, paramsLocal ->
def lv = { a,b -> return MethodRankHelper.delDistance(a, b) }
def pL = paramsLocal.collectEntries { k,v ->
if (v instanceof Map) {
if (v.containsKey(sample))
return [k, v[sample]]
if (v.containsKey('default'))
return [k, v['default']]
if (lv(k,sample) > 30)
return [k,v]
throw new Exception("Not a valid entry in " + k + ". The sample " + sample + " is not found in " + v +" ; Make sure your samples are correctly specified when using the multi-sample feature.")
} else {
def lv = { a,b -> return org.codehaus.groovy.runtime.MethodRankHelper.delDistance(a, b) }
def pL = paramsLocal.collectEntries { k,v ->
if (v instanceof Map) {
if (v.containsKey(sample))
return [k, v[sample]]
if (v.containsKey('default'))
return [k, v['default']]
def closeMatches = v.collectEntries { vk, vv -> [lv(vk, sample), vk] }.keySet().findAll { it < 30}
if(closeMatches.size() > 0)
throw new Exception("The sample " + sample + " is not found in " + v +" ; Make sure your samples are correctly specified when using the multi-sample feature.")
else
return [k,v]
} else {
return [k,v]
}
}
return [global: paramsGlobal, local: pL]
}
}
return [global: paramsGlobal, local: pL]
}
}
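
The per-sample lookup that the updated ``parseConfig`` closure appears to implement can be sketched outside Nextflow. The Python below is an illustrative restatement under stated assumptions, not the pipeline's code: ``resolve_params``, ``edit_distance``, and the ``max_distance`` threshold are hypothetical names, and a plain Levenshtein distance stands in for Groovy's ``MethodRankHelper.delDistance``, whose cost weighting differs.

```python
from typing import Any, Dict


def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance (assumed stand-in for delDistance)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # deletion
                cur[j - 1] + 1,            # insertion
                prev[j - 1] + (ca != cb),  # substitution (free on a match)
            ))
        prev = cur
    return prev[-1]


def resolve_params(sample: str, params_local: Dict[str, Any],
                   max_distance: int = 3) -> Dict[str, Any]:
    """Resolve each parameter for one sample.

    Order of precedence for a map-valued parameter:
      1. the entry keyed by the sample name,
      2. the entry keyed by 'default',
      3. otherwise, if any key looks like a misspelled sample name, fail;
         if none does, treat the map as an ordinary nested parameter.
    """
    resolved = {}
    for key, value in params_local.items():
        if not isinstance(value, dict):
            resolved[key] = value
        elif sample in value:
            resolved[key] = value[sample]
        elif "default" in value:
            resolved[key] = value["default"]
        else:
            close = [name for name in value
                     if edit_distance(str(name), sample) < max_distance]
            if close:
                raise ValueError(
                    f"The sample {sample} is not found in {value}; "
                    f"close matches: {close}"
                )
            resolved[key] = value
    return resolved


if __name__ == "__main__":
    # Hypothetical parameter names, for illustration only.
    params = {
        "min_genes": {"sampleA": 200, "default": 100},
        "n_pcs": 50,
    }
    print(resolve_params("sampleB", params))
```

The close-match check is what separates a map keyed by sample names, where a near miss is most likely a typo and should stop the run, from a map that is simply a nested parameter and should pass through unchanged.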
46 changes: 1 addition & 45 deletions docs/case-studies.rst
Original file line number Diff line number Diff line change
@@ -1,49 +1,5 @@
Case Studies
=============


Kurmangaliyev et al., 2019
--------------------------

A single cell analysis of transcriptional control of neuronal connectivity in Drosophila,
based on `Kurmangaliyev et al., 2019 <https://elifesciences.org/articles/50822>`_.

.. note:: `Full tutorial here <https://vsn-pipelines-examples.readthedocs.io/en/latest/Kurmangaliyev.html>`_.

This case study illustrates the following steps:

1. **Input data** is loaded directly from the `Sequence Read Archive (SRA) <https://www.ncbi.nlm.nih.gov/sra>`_ by giving an SRA identifier to the ``sra`` input channel.
2. Cell Ranger is run to generate expression counts
3. Multiple samples are combined, and **batch effect correction** is performed with both BBKNN and Harmony (in separate pipeline runs).
4. **Gene regulatory network inference** is performed using the SCENIC pipeline. The SCENIC append mode is used to include the SCENIC results with both independent batch effect correction methods, to avoid re-running SCENIC.


Hung et al., 2019
-----------------

A single cell analysis of the adult Drosophila midgut, based on
`Hung et al., 2019 <https://vsn-pipelines-examples.readthedocs.io/en/latest/PBMC10k.html>`_.

.. note:: `Full tutorial here <https://vsn-pipelines-examples.readthedocs.io/en/latest/Hung.html>`_.

This case study illustrates the following steps:

1. **Input data** is loaded directly from the `Sequence Read Archive (SRA) <https://www.ncbi.nlm.nih.gov/sra>`_ by giving an SRA identifier to the ``sra`` input channel.
2. Cell Ranger is run to generate expression counts
3. Multiple samples are combined, and **batch effect correction** is performed with BBKNN
4. **Gene regulatory network inference** is performed using the SCENIC pipeline.


PBMC10k
-------

An analysis of a sample dataset from 10x Genomics consisting of 10,000 PBMCs from a healthy human donor.

.. note:: `Full tutorial here <https://vsn-pipelines-examples.readthedocs.io/en/latest/PBMC10k.html>`_.

This case study illustrates the following steps:

1. **Input data** is filtered Cell Ranger counts downloaded from the 10x Genomics support website.
2. The single sample is run through the standard ``single_sample`` pipeline.
3. **Gene regulatory network inference** is performed using the SCENIC pipeline and integrated with the highly variable genes analysis.
See the full list of case studies and examples at `VSN-Pipelines-examples <https://vsn-pipelines-examples.readthedocs.io/en/latest/>`_.
