marbel (MetAtranscriptomic Reference Builder Evaluation Library)

This project generates an in silico metatranscriptomic dataset based on specified parameters.

Installation

Install guide for development purposes

Install miniconda (if not installed already)

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh

bash Miniconda3-latest-Linux-x86_64.sh

Create conda env

conda create -n marbel python=3.10 r-base
conda activate marbel

Install git-lfs (absolutely necessary)

Before cloning the repo, you need to have git-lfs installed! If git-lfs is missing and you have root rights, install it with:

sudo apt-get install git-lfs

If you do not have root permission, install it in the Conda env:

conda install anaconda::git-lfs

Now we need to initialize Git LFS:

git lfs install

If you already cloned the repo, remove it, install git-lfs and clone again.
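As a quick sanity check before cloning, the sketch below tests whether the git-lfs binary is on PATH. The function name `check_git_lfs` is a hypothetical helper for illustration, not part of marbel.

```shell
# Hypothetical helper: report whether git-lfs is available on PATH.
check_git_lfs() {
    if command -v git-lfs >/dev/null 2>&1; then
        # git-lfs prints its own version string
        echo "git-lfs found: $(git-lfs version)"
        return 0
    fi
    echo "git-lfs not found; install it before cloning" >&2
    return 1
}

# Run the check and print a hint instead of aborting when it fails.
check_git_lfs || echo "Install git-lfs, run 'git lfs install', then clone."
```

If the check fails inside the Conda env, make sure the env is activated, since a Conda-installed git-lfs is only on PATH while the env is active.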

Install g++ (Optional, for performance)

sudo apt-get install g++

Clone repository

git clone https://github.com/jlab/marbel.git

Install the package:

cd marbel
pip install -e .

(Not ready, this is for later) Conda build and install

It is recommended to install the package with conda install.

Build the package with:

conda build .

For this you need to have conda-build installed (conda install conda-build).

Create new environment and install package:

conda create -n marbel
conda activate marbel
conda install --use-local marbel

Usage

To get help on how to use the script, run:

marbel --help

Command Line Arguments

# Usage: marbel [OPTIONS]

## Options:
- `--n-species` **INTEGER**  
  Number of species to be drawn for the metatranscriptomic in silico dataset.  
  **[default: 20]**

- `--n-orthogroups` **INTEGER**  
  Number of orthologous groups to be drawn for the metatranscriptomic in silico dataset.  
  **[default: 1000]**

- `--n-samples` **<INTEGER INTEGER>...**  
  Number of samples to be created for the metatranscriptomic in silico dataset. The first number represents the number of samples for group 1, and the second is for group 2.  
  **[default: 10, 10]**

- `--outdir` **TEXT**  
  Output directory for the metatranscriptomic in silico dataset.  
  **[default: simulated_reads]**

- `--max-phylo-distance` **[phylum|class|order|family|genus]**  
  Maximum mean phylogenetic distance for orthologous groups. Specify a stricter limit to exclude groups whose members are more phylogenetically diverse.  
  **[default: None]**

- `--min-identity` **FLOAT**  
  Minimum mean sequence identity score for orthologous groups. Specify for more stringent identity requirements.  
  **[default: None]**

- `--dge-ratio` **FLOAT**  
  Ratio of up- and down-regulated genes. The first value is the ratio of up-regulated genes, and the second represents the ratio of down-regulated genes.  
  **[default: 0.1]**

- `--seed` **INTEGER**  
  Seed for sampling. Set for reproducibility.  
  **[default: None]**

- `--error-model` **[basic|perfect|HiSeq|NextSeq|NovaSeq|Miseq-20|Miseq-24|Miseq-28|Miseq-32]**  
  Sequencer model for the reads. Use `basic` or `perfect` (no errors) for custom read length.  
  **[default: HiSeq]**

- `--compressed / --no-compressed`  
  Compress the output FASTQ files.  
  **[default: compressed]**

- `--read-length` **INTEGER**  
  Read length for the generated reads. Only available when `--error-model` is `basic` or `perfect`.  
  **[default: None]**

- `--library-size` **INTEGER**  
  Library size for the reads.  
  **[default: 100000]**

- `--library-size-distribution` **[poisson|uniform|negative_binomial]**  
  Distribution for the library size.  
  **[default: uniform]**

- `--threads` **INTEGER**  
  Number of threads to be used.  
  **[default: 10]**

- `--version`  
  Show the version and exit.

- `--help`  
  Show this message and exit.
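
The option descriptions above state that `--read-length` is only available with the `basic` or `perfect` error models. The sketch below shows one way to check that combination before launching a run. `marbel_args` is a hypothetical wrapper for illustration, not part of marbel, and how marbel itself reacts to an invalid combination is not covered here.

```shell
# Hypothetical helper (not part of marbel): print the error-model arguments,
# refusing --read-length unless the model is basic or perfect.
marbel_args() {
    model="$1"
    read_length="$2"   # may be empty
    case "$model" in
        basic|perfect) ;;
        *)
            if [ -n "$read_length" ]; then
                echo "error: --read-length requires --error-model basic or perfect" >&2
                return 1
            fi
            ;;
    esac
    if [ -n "$read_length" ]; then
        echo "--error-model $model --read-length $read_length"
    else
        echo "--error-model $model"
    fi
}

# Dry run: print the command that would be used; nothing is executed.
echo "marbel $(marbel_args basic 100)"
```

Calling `marbel_args HiSeq 100` returns a non-zero status, so a calling script can stop before starting a simulation with an unsupported combination.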

Examples

Running with Default Parameters

marbel

Specifying Number of Species, Orthogroups, and Samples

marbel --n-species 10 --n-orthogroups 500 --n-samples 5 8

This command will generate a dataset with:

- 10 species
- 500 orthologous groups
- 5 samples for group 1
- 8 samples for group 2
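
To keep several runs apart, the same command can be combined with `--seed` and `--outdir`. The sketch below only prints the command lines (a dry run). The helper name `seeded_cmd` and the output directory naming scheme are assumptions for illustration, and whether a fixed seed yields identical output across machines has not been verified here.

```shell
# Hypothetical helper (not part of marbel): print a marbel command line for a
# given seed, writing each run to its own output directory.
seeded_cmd() {
    seed="$1"
    echo "marbel --n-species 10 --n-orthogroups 500 --n-samples 5 8 --seed $seed --outdir simulated_reads_seed$seed"
}

# Dry run: print the commands for three seeds instead of executing them.
for s in 1 2 3; do
    seeded_cmd "$s"
done
```

Each printed line can be pasted into a shell where the marbel environment is active.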

Contributing

Contributions are welcome! Please open an issue or submit a pull request for any changes.

License

This project is licensed under the Apache License 2.0. See the LICENSE file for details.

Feel free to reach out if you have any questions or need further assistance with the usage of the tool.
