GitHub - cbclobato/wild-again: Bioinformatic workflow, statistical analysis and figure preparation for the Cannabis seed microbiome MS

Wild again: Recovery of a beneficial Cannabis seed endophyte from low domestication genotypes

Authors: Carolina Lobato, João Machado de Freitas, Daniel Habich, Isabella Kögl, Gabriele Berg & Tomislav Cernava

Institute of Environmental Biotechnology (UBT) — TU Graz

Project structure

project/
├── data/
│   └── metadata/
│       ├── CB1/
│       │   ├── bcpr-fw.fasta
│       │   ├── bcpr-rv.fasta
│       │   ├── cat.sh
│       │   └── manifest.csv
│       ├── CB2/
│       │   ├── bcpr-fw.fasta
│       │   ├── bcpr-rv.fasta
│       │   ├── cat.sh
│       │   └── manifest.csv
│       ├── CB3/
│       │   ├── bcpr-fw.fasta
│       │   ├── bcpr-rv.fasta
│       │   ├── cat.sh
│       │   └── manifest.csv
│       ├── metadata.csv
│       ├── pouches-ind.tsv
│       ├── pouches-exp.tsv
│       ├── field23.tsv
│       └── PLaBase.tsv
├── scripts/
│   ├── qiime2/
│   │   └── bioprocessing_pipeline.sh
│   ├── r/
│   │   ├── Setup.Rmd
│   │   ├── Figure1.Rmd
│   │   ├── Figure2.Rmd
│   │   ├── Figure3.Rmd
│   │   ├── Figure4.Rmd
│   │   └── Figure5.Rmd
│   └── utils/
│       ├── csv2fasta.sh
│       ├── csv2tsv.sh
│       ├── qiime2r.sh
│       ├── install.R
│       ├── plot_composition_v2.R
│       ├── umap/
│       │   ├── data.R
│       │   ├── project.R
│       │   ├── cluster-analysis.R
│       │   └── run-all.R
│       └── biomarkers/
│           ├── data.R
│           ├── feature-importance.R
│           ├── sv-importance-patch.R
│           ├── train-eval.R 
│           └── run-all.R
├── outputs/
│   ├── qiime2/
│   └── r/
├── README.md
└── LICENSE

Details

data/
- metadata/ This subdirectory contains:
  - the FASTA barcode files for each pool (bcpr-fw.fasta and bcpr-rv.fasta).
  - the concatenation script for each pool (cat.sh).
  - the manifest file for each pool (manifest.csv).
  - the sample information for the metabarcoding analysis (metadata.csv).
  - the individual-level metadata and measurements from the lab trials (pouches-ind.tsv).
  - the metadata and measurements for each replicate experiment in the lab trials (pouches-exp.tsv).
  - the metadata and measurements from the field trials (field23.tsv).
  - the gene class table output by PLaBase (PLaBase.tsv).
scripts/
- qiime2/ This subdirectory contains the pipeline (bioprocessing_pipeline.sh) used for:
  - demultiplexing with Cutadapt v4.2.
  - importing into QIIME2 v2023.5 and further bioinformatic processing with the DADA2 pipeline and the VSEARCH algorithm against the SILVA v138 reference database, which generated the feature table, taxonomy file, representative sequences and phylogenetic tree.
  - exporting from QIIME2.
- r/ This subdirectory contains the scripts used in R to create the phyloseq objects, preprocess the data, and prepare the figures for the manuscript.
- utils/ This subdirectory contains utility scripts that are used by other scripts in the project, such as:
  - csv2fasta.sh for converting .csv to .fasta format.
  - csv2tsv.sh for converting .csv to .tsv format.
  - qiime2r.sh for converting .biom to .tsv format and back.
  - install.R for installing the necessary packages in R.
  - plot_composition_v2.R, a modified version of the microbiome::plot_composition function for use when the microbiome package version is above 1.6.
  - umap/ contains the scripts used for beta diversity representation with UMAP shown in Figure2.
  - biomarkers/ contains the scripts used for biomarker assessment shown in Figure3.
outputs/ contains the saved QIIME2 and R outputs.
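
The conversion utilities listed under utils/ are not reproduced in this README. As a rough illustration of what a .csv to .tsv and a .csv to .fasta conversion can look like, here is a minimal sketch. The function names csv_to_tsv and csv_to_fasta are hypothetical, and the input layout (a header row followed by id,sequence rows, with no quoted fields containing commas) is an assumption; the repository's csv2tsv.sh and csv2fasta.sh may work differently.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; NOT the repository's csv2tsv.sh or csv2fasta.sh.

# Convert a simple comma-separated file to tab-separated.
# Assumption: no field contains a quoted comma.
csv_to_tsv() {
  tr ',' '\t' < "$1"
}

# Convert a two-column CSV (id,sequence) to FASTA.
# Assumption: the first row is a header and is skipped.
csv_to_fasta() {
  awk -F',' 'NR > 1 && NF >= 2 {
    sub(/\r$/, "", $2)              # tolerate Windows line endings
    printf ">%s\n%s\n", $1, $2
  }' "$1"
}
```

Called as `csv_to_fasta barcodes.csv > barcodes.fasta`, where barcodes.csv is a placeholder file name, each data row becomes a FASTA header line followed by its sequence line.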

Further content

The 16S rRNA gene amplicon raw FASTQ files were deposited in ENA under the accession number PRJEB64469.

The assembled genome of Bacillus frigoritolerans, with the associated annotations, was deposited in NCBI under the accession number PRJNA1113337.
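
To locate the deposited FASTQ files, one option is the ENA Portal API file report, which lists the runs and download paths for a study accession. The sketch below only builds the request URL; the helper name ena_filereport_url and the output file name are hypothetical, and the chosen fields (run_accession, fastq_ftp) are an assumption about what is useful here, not something this repository prescribes.

```shell
#!/usr/bin/env bash
# Hypothetical helper; not part of this repository.

# Print the ENA Portal API "filereport" URL for a study accession.
ena_filereport_url() {
  local accession="$1"
  printf 'https://www.ebi.ac.uk/ena/portal/api/filereport?accession=%s&result=read_run&fields=run_accession,fastq_ftp&format=tsv\n' "$accession"
}

# Example (requires network access and curl):
#   curl -s "$(ena_filereport_url PRJEB64469)" > PRJEB64469_runs.tsv
```

The resulting table can then be used to fetch the individual FASTQ files with a download tool of choice.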

References

Lobato, C. et al. Wild again: recovery of a beneficial Cannabis seed endophyte from low domestication genotypes. Microbiome 12, 239 (2024). https://doi.org/10.1186/s40168-024-01951-5
