Road Map

gitliver edited this page May 2, 2016 · 29 revisions

Short-Term Upgrades:

  • exit gracefully if star, bowtie2, or trinity crashes (for example, from lack of memory)

  • count gene features in the host-map BAM file for pathway quantification (e.g., innate immunity, hypoxia). This requires adding a custom GTF file to resources/ and a software dependency on the featureCounts program

  • open question: how to analyze orfs_noblastp.fasta downstream (perhaps by k-mer composition)

  • improve logging and graceful exits: write errors to stderr (not stdout), and name the step in the error message (so it is clear where the error occurred); determine how to deal with constituent programs that echo a lot of verbiage to stdout or stderr (should this output go in the main logs, sub-logs, etc.?)

  • make a testing framework with sample files, so code can be changed without fear of introducing errors

  • get the machine learning part working: design a panel of RNA (transcriptome) pathogen dilutions, then use it to calibrate sensitivity and related metrics

  • make the SGE memory flags command-line options, or configurable via the config file
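
One possible shape for the logging and graceful-exit items above, as a minimal Python sketch rather than the pipeline's actual code. The function name `run_step`, the step names, and the log paths are all hypothetical. It assumes each constituent program is launched as a child process: the child's verbose output goes to a per-step sub-log, and any failure is reported on stderr with the step name attached.

```python
import subprocess
import sys


def run_step(step_name, cmd, log_path):
    """Run one pipeline step; return True on success, False on failure.

    The child's stdout and stderr are both redirected to a per-step
    sub-log so that verbose tools do not flood the main log.
    """
    with open(log_path, "w") as log:
        try:
            result = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)
        except OSError as exc:
            # The executable is missing or could not be launched.
            print(f"[error] step '{step_name}': could not start {cmd[0]}: {exc}",
                  file=sys.stderr)
            return False

    if result.returncode != 0:
        # On POSIX a negative return code means the child was killed by a
        # signal, which is how an out-of-memory kill usually shows up.
        print(f"[error] step '{step_name}' exited with code "
              f"{result.returncode}; see {log_path}", file=sys.stderr)
        return False
    return True
```

A caller could stop the run cleanly with something like `if not run_step("host_map", cmd, "host_map.log"): sys.exit(1)`, so a crash in one tool surfaces as a single labeled line on stderr instead of a traceback or a silent partial result.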

Multi-Sample Aggregation:

  • "pandora aggregate" will condense the information from individual runs into a single table with a fixed set of features:
      • total reads
      • number of non-human reads
      • summary stats of the contig distribution
      • viral load, # viral taxa
      • bacterial load, # bacterial taxa
      • breakdown of aerobic / facultative / anaerobic bacteria
      • breakdown of gram-positive / gram-negative bacteria
      • recurrent amino-acid motifs discovered in orfs_noblastp.fasta (comma-separated list of motif_id values)
  • explore correlations between host expression and microbial summary features
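
As an illustration of the fixed-feature table described above, here is a minimal Python sketch. It is not the planned implementation: the column names in `FEATURES`, the `aggregate` function, and the per-sample dictionaries are all hypothetical stand-ins for whatever each run actually reports. The point is only that every sample yields one row with the same columns, and a feature a run did not report is written as `NA`.

```python
import csv

# Hypothetical column names, chosen to mirror the feature list above.
FEATURES = [
    "sample",
    "total_reads",
    "nonhuman_reads",
    "median_contig_length",
    "viral_load",
    "n_viral_taxa",
    "bacterial_load",
    "n_bacterial_taxa",
    "frac_aerobic",
    "frac_facultative",
    "frac_anaerobic",
    "frac_gram_positive",
    "frac_gram_negative",
    "recurrent_motif_ids",
]


def aggregate(sample_summaries, out_path):
    """Write one tab-separated row per sample with a fixed set of columns.

    Missing features are filled with "NA"; keys outside FEATURES are ignored.
    """
    with open(out_path, "w", newline="") as handle:
        writer = csv.DictWriter(
            handle,
            fieldnames=FEATURES,
            delimiter="\t",
            restval="NA",
            extrasaction="ignore",
        )
        writer.writeheader()
        for summary in sample_summaries:
            writer.writerow(summary)
```

Keeping the column list in one place means downstream steps, such as correlating host expression with microbial features, can rely on a stable table layout across samples.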

Scientific Hygiene:

  • design a 2D calibration panel to promote intuition about sensitivity vs. load vs. genome size etc.
  • how to give a probabilistic assignment of taxa present?

Long-Term Upgrades:

  • support for SLURM