Merge pull request #194 from vib-singlecell-nf/develop

Develop

Former-commit-id: 17c89c9
vib-singlecell-nf · Apr 3, 2020 · 146cfc3
2 parents 66bd074 + 1f41dc1
commit 146cfc3
Showing 17 changed files with 717 additions and 217 deletions.
diff --git a/README.rst b/README.rst
@@ -1,7 +1,109 @@
 VSN-Pipelines
 ==============
 
-|Nextflow| |Gitter| |ReadTheDocs| |Zenodo|
+A repository of pipelines for single-cell data analysis in Nextflow DSL2.
+
+|VSN-Pipelines| |ReadTheDocs| |Zenodo| |Gitter| |Nextflow|
+
+
+**Full documentation** is available on `Read the Docs <https://vsn-pipelines.readthedocs.io/en/latest/>`_, or take a look at the `Quick Start <https://vsn-pipelines.readthedocs.io/en/latest/getting-started.html#quick-start>`_ guide.
+
+This main repo contains multiple workflows for analyzing single cell transcriptomics data, and depends on a number of tools, which are organized into submodules within the VIB-Singlecell-NF_ organization.
+Currently available workflows are listed below.
+
+Raw Data Processing Workflows
+-----------------------------
+
+These are set up to run Cell Ranger and DropSeq pipelines.
+
+.. list-table:: Raw Data Processing Workflows
+    :widths: 15 10 30
+    :header-rows: 1
+
+    * - Pipeline / Entrypoint
+      - Purpose
+      - Documentation
+    * - cellranger
+      - Process 10x Chromium data
+      - cellranger_
+    * - demuxlet_freemuxlet
+      - Demultiplexing
+      - demuxlet_freemuxlet_
+    * - nemesh
+      - Process Drop-seq data
+      - nemesh_
+
+.. _cellranger: https://vsn-pipelines.readthedocs.io/en/latest/pipelines.html#cellranger
+.. _demuxlet_freemuxlet: https://vsn-pipelines.readthedocs.io/en/develop/pipelines.html#demuxlet-freemuxlet
+.. _nemesh: https://vsn-pipelines.readthedocs.io/en/develop/pipelines.html#nemesh
+
+
+Single Sample Workflows
+-----------------------
+
+The **Single Sample Workflows** perform a "best practices" scRNA-seq analysis. Multiple samples can be run in parallel, treating each sample separately.
+
+.. list-table:: Single Sample Workflows
+    :header-rows: 1
+
+    * - Pipeline / Entrypoint
+      - Purpose
+      - Documentation
+    * - single_sample
+      - Independent samples
+      - |single_sample|
+    * - single_sample_scenic
+      - Ind. samples + SCENIC
+      - |single_sample_scenic|
+    * - scenic
+      - SCENIC GRN inference
+      - |scenic|
+    * - scenic_multiruns
+      - SCENIC run multiple times
+      - |scenic_multiruns|
+    * - single_sample_scenic_multiruns
+      - Ind. samples + multi-SCENIC
+      - |single_sample_scenic_multiruns|
+
+
+Sample Aggregation Workflows
+----------------------------
+
+**Sample Aggregation Workflows**: perform a "best practices" scRNA-seq analysis on a merged and batch-corrected group of samples. Available batch correction methods include BBKNN, mnnCorrect, and Harmony.
+
+.. list-table:: Sample Aggregation Pipelines
+    :widths: 15 10 30
+    :header-rows: 1
+
+    * - Pipeline / Entrypoint
+      - Purpose
+      - Documentation
+    * - bbknn
+      - Sample aggregation + BBKNN
+      - |bbknn|
+    * - bbknn_scenic
+      - BBKNN + SCENIC
+      - |bbknn_scenic|
+    * - harmony
+      - Sample aggregation + Harmony
+      - |harmony|
+    * - mnncorrect
+      - Sample aggregation + mnnCorrect
+      - |mnncorrect|
+
+
+In addition, the pySCENIC_ implementation of the SCENIC_ workflow is integrated here and can be run in conjunction with any of the above workflows.
+The output of each of the main workflows is a loom_-format file, which is ready for import into the interactive single-cell web visualization tool SCope_.
+In addition, data is also output in h5ad format, and reports are generated for the major pipeline steps.
+
+If VSN-Pipelines is useful for your research, consider citing:
+
+- VSN-Pipelines All Versions (latest): `10.5281/zenodo.3703108 <https://doi.org/10.5281/zenodo.3703108>`_.
+
+
+.. |VSN-Pipelines| image:: https://img.shields.io/github/v/release/vib-singlecell-nf/vsn-pipelines
+    :target: https://github.com/vib-singlecell-nf/vsn-pipelines/releases
+    :alt: GitHub release (latest by date)
 
 .. |ReadTheDocs| image:: https://readthedocs.org/projects/vsn-pipelines/badge/?version=latest
     :target: https://vsn-pipelines.readthedocs.io/en/latest/?badge=latest
@@ -19,8 +121,11 @@ VSN-Pipelines
     :target: https://zenodo.org/badge/latestdoi/199477571
     :alt: Zenodo
 
-
-|single_sample| |single_sample_scenic| |scenic| |scenic_multiruns| |single_sample_scenic_multiruns| |bbknn| |bbknn_scenic| |harmony| |mnncorrect|
+.. _VIB-Singlecell-NF: https://github.com/vib-singlecell-nf
+.. _pySCENIC: https://github.com/aertslab/pySCENIC
+.. _SCENIC: https://aertslab.org/#scenic
+.. _loom: http://loompy.org/
+.. _SCope: http://scope.aertslab.org/
 
 .. |single_sample| image:: https://github.com/vib-singlecell-nf/vsn-pipelines/workflows/single_sample/badge.svg
     :target: https://vsn-pipelines.readthedocs.io/en/latest/pipelines.html#single-sample-single-sample
@@ -58,39 +163,3 @@ VSN-Pipelines
     :target: https://vsn-pipelines.readthedocs.io/en/latest/pipelines.html#mnncorrect-mnncorrect
     :alt: MNN-correct Pipeline
 
-A repository of pipelines for single-cell data in Nextflow DSL2.
-
-A quick tour of the VSN pipelines ? Please read `Quick Start <https://vsn-pipelines.readthedocs.io/en/latest/getting-started.html#quick-start>`_.
-
-Full documentation available on `Read the Docs <https://vsn-pipelines.readthedocs.io/en/latest/>`_
-
-If VSN-Pipelines is useful for your research, consider citing:
-
-- VSN-Pipelines All Versions (latest): `10.5281/zenodo.3703108 <https://doi.org/10.5281/zenodo.3703108>`_.
-
-This main repo contains multiple workflows for analyzing single cell transcriptomics data, and depends on a number of tools, which are organized into submodules within the VIB-Singlecell-NF_ organization.
-Currently available workflows include:
-
-.. _VIB-Singlecell-NF: https://github.com/vib-singlecell-nf
-
-- **Cell Ranger**: processes 10x Chromium data to align reads to generate an expression counts matrix.
-- **DropSeq**: processes Drop-seq data from read alignment to expression counts.
-- **Single sample workflows**: perform a "best practices" scRNA-seq analysis. Multiple samples can be run in parallel, treating each sample separately.
-- **Multi-sample workflows**: perform a "best practices" scRNA-seq analysis on a merged and batch-corrected group of samples. Available batch correction methods include:
-
-    - **BBKNN**
-    - **mnnCorrect**
-    - **Harmony**
-
-* **GRN inference**:
-
-    * The pySCENIC_ implementation of the SCENIC_ workflow is integrated here and can be run in conjunction with any of the above workflows.
-
-.. _pySCENIC: https://github.com/aertslab/pySCENIC
-.. _SCENIC: https://aertslab.org/#scenic
-
-The output of each of the main workflows is a loom_-format file, which is ready for import into the interactive single-cell web visualization tool SCope_.
-In addition, data is also output in h5ad format, and reports are generated for the major pipeline steps.
-
-.. _loom: http://loompy.org/
-.. _SCope: http://scope.aertslab.org/
diff --git a/conf/generic.config b/conf/generic.config
@@ -1,20 +1,22 @@
 params {
-    // This closure facilitates the usage of sample specific parameters
+   // This closure facilitates the usage of sample specific parameters
    parseConfig = { sample, paramsGlobal, paramsLocal ->
-        def lv = { a,b -> return MethodRankHelper.delDistance(a, b) }
-        def pL = paramsLocal.collectEntries { k,v ->
-           if (v instanceof Map) {
-                if (v.containsKey(sample))
-                   return [k, v[sample]]
-                if (v.containsKey('default'))
-                   return [k, v['default']]
-                if (lv(k,sample) > 30)
-                   return [k,v]
-                throw new Exception("Not a valid entry in " + k + ". The sample " + sample + " is not found in " + v +" ; Make sure your samples are correctly specified when using the multi-sample feature.")
-           } else {
+         def lv = { a,b -> return org.codehaus.groovy.runtime.MethodRankHelper.delDistance(a, b) }
+         def pL = paramsLocal.collectEntries { k,v ->
+            if (v instanceof Map) {
+               if (v.containsKey(sample))
+                  return [k, v[sample]]
+               if (v.containsKey('default'))
+                  return [k, v['default']]
+               def closeMatches = v.collectEntries { vk, vv -> [lv(vk, sample), vk] }.keySet().findAll { it < 30}
+               if(closeMatches.size() > 0)
+                  throw new Exception("The sample " + sample + " is not found in " + v +" ; Make sure your samples are correctly specified when using the multi-sample feature.")
+               else
+                  return [k,v]
+            } else {
                return [k,v]
-           }
-       }
-       return [global: paramsGlobal, local: pL]
+         }
+      }
+      return [global: paramsGlobal, local: pL]
    }
 }
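
The lookup order that the revised ``parseConfig`` closure applies to each local parameter can be sketched outside Nextflow. The Python below is only an illustration of that logic, not the pipeline's code: ``parse_config``, ``edit_distance``, and ``MAX_TYPO_EDITS`` are hypothetical names, and a plain Levenshtein distance with a threshold of 3 edits is assumed as a stand-in for Groovy's weighted ``MethodRankHelper.delDistance`` and its ``< 30`` cutoff.

```python
def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance (assumed stand-in for Groovy's weighted delDistance)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


# Hypothetical threshold; the diff compares a weighted distance against 30.
MAX_TYPO_EDITS = 3


def parse_config(sample, params_global, params_local):
    """Resolve sample-specific values in params_local for one sample."""
    resolved = {}
    for key, value in params_local.items():
        if not isinstance(value, dict):
            # Scalar parameter: applies to every sample unchanged.
            resolved[key] = value
        elif sample in value:
            # Exact per-sample entry wins.
            resolved[key] = value[sample]
        elif "default" in value:
            # Fall back to the declared default.
            resolved[key] = value["default"]
        elif any(edit_distance(str(name), sample) < MAX_TYPO_EDITS for name in value):
            # Some key looks like a sample name, so this map was probably meant
            # to be per-sample and the requested sample is missing or misspelled.
            raise ValueError(
                f"The sample {sample} is not found in {value}; make sure your "
                "samples are correctly specified when using the multi-sample feature."
            )
        else:
            # No key resembles the sample name: treat it as an ordinary map parameter.
            resolved[key] = value
    return {"global": params_global, "local": resolved}
```

With this sketch, ``parse_config("sampleA", {}, {"min_genes": {"sampleA": 200, "default": 100}})`` resolves ``min_genes`` to the ``sampleA`` entry, while a map whose keys do not resemble the sample name is passed through untouched. Compared with the removed branch, which returned the map only when the parameter name itself was far from the sample name, the new branch compares the map's keys against the sample name before deciding whether to raise.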
diff --git a/docs/case-studies.rst b/docs/case-studies.rst
@@ -1,49 +1,5 @@
 Case Studies
 =============
 
-
-Kurmangaliyev et al., 2019
---------------------------
-
-A single cell analysis of transcriptional control of neuronal connectivity in Drosophila,
-based on `Kurmangaliyev et al., 2019 <https://elifesciences.org/articles/50822>`_.
-
-.. note:: `Full tutorial here <https://vsn-pipelines-examples.readthedocs.io/en/latest/Kurmangaliyev.html>`_.
-
-This case study illustrates the following steps:
-
-1. **Input data** is loaded directly from the `Sequence Read Archive (SRA) <https://www.ncbi.nlm.nih.gov/sra>`_ by giving an SRA identifier to the ``sra`` input channel.
-2. Cell Ranger is run to generate expression counts
-3. Multiple samples are combined, and **batch effect correction** is performed with both BBKNN and Harmony (in separate pipeline runs).
-4. **Gene regulatory network inference** is performed using the SCENIC pipeline. The SCENIC append mode is used to include the SCENIC results with both independent batch effect correction methods, to avoid re-running SCENIC.
-
-
-Hung et al., 2019
------------------
-
-A single cell analysis of the adult Drosophila midgut, based on
-`Hung et al., 2019 <https://vsn-pipelines-examples.readthedocs.io/en/latest/PBMC10k.html>`_.
-
-.. note:: `Full tutorial here <https://vsn-pipelines-examples.readthedocs.io/en/latest/Hung.html>`_.
-
-This case study illustrates the following steps:
-
-1. **Input data** is loaded directly from the `Sequence Read Archive (SRA) <https://www.ncbi.nlm.nih.gov/sra>`_ by giving an SRA identifier to the ``sra`` input channel.
-2. Cell Ranger is run to generate expression counts
-3. Multiple samples are combined, and **batch effect correction** is performed with BBKNN
-4. **Gene regulatory network inference** is performed using the SCENIC pipeline.
-
-
-PBMC10k
--------
-
-An analysis of a sample dataset from 10x Genomics consisting of 10,000 PBMCs from a healthy human donor.
-
-.. note:: `Full tutorial here <https://vsn-pipelines-examples.readthedocs.io/en/latest/PBMC10k.html>`_.
-
-This case study illustrates the following steps:
-
-1. **Input data** is filtered Cell Ranger counts downloaded from the 10x Genomics support website.
-2. The single sample is run through the standard ``single_sample`` pipeline.
-3. **Gene regulatory network inference** is performed using the SCENIC pipeline and integrated with the highly variable genes analysis.
+See the full list of case studies and examples at `VSN-Pipelines-examples <https://vsn-pipelines-examples.readthedocs.io/en/latest/>`_.