Clone the project from GitHub. Create and activate a Python virtual environment:
python3 -m venv venv
source venv/bin/activate
Then install the dependencies:
pip install -r requirements.txt
Run the application with the following command:
python3 src/compana.py <arguments>
The application accepts the following CLI arguments:
Short | Long | Description |
---|---|---|
g | gffcompare_gtf | provide full path for the GTF file produced by gffcompare. |
r | reference_gtf | provide full path for the reference GTF file used to create the gffcompare GTF file. |
i | isoquant_gtf | provide full path for the IsoQuant transcript_model.gtf file. |
a | reference_fasta | provide full path for the reference FASTA file (e.g. fa). |
b | reads_bam | provide full path for reads provided to IsoQuant |
t | reads_tsv | provide full path for model_reads.tsv generated by IsoQuant |
f | force | force re-creation of sqlite3 database. By default a new database is not created if one already exists to improve efficiency. |
s | stats | output statistics on class codes. |
c | class-code | specify one or several class codes for which to generate offset data. |
o | offset | provide a closed range of offsets to be extracted from comparison results. Provide one or two values. If one value is given, the range will be set to (0, max(0, given value)) |
j | json | provide a filename for a json-file containing arguments. Example below. |
w | window_size | size of the window within which indels are searched. |
e | extended_debugging | enable extended debug output, which creates additional log files. |
m | min_reads_for_graph | minimum number of cases required for creating images. |
n | no_canonicals | canonical splice sites are not considered. Enables more aggressive error prediction and correction. |
v | very_conservative | canonical splice sites are considered, the threshold must be exceeded, and there must be a concentration of deletions. |
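
The one-value rule for `offset` in the table can be sketched as follows. This is a minimal illustration, not compAna's actual code: the function name `normalize_offset` is hypothetical, and since the table does not say how two values given in reverse order are handled, the sketch passes two values through unchanged.

```python
from typing import Sequence, Tuple


def normalize_offset(values: Sequence[int]) -> Tuple[int, int]:
    """Turn one or two offset values into a closed (low, high) range.

    Hypothetical helper: mirrors the rule stated in the table, where a
    single value v becomes the range (0, max(0, v)).
    """
    if len(values) == 1:
        # A single value is treated as the upper bound, never below zero.
        return (0, max(0, values[0]))
    if len(values) == 2:
        # Assumption: two values are used as given, without reordering.
        low, high = values
        return (low, high)
    raise ValueError("offset expects one or two values")
```

Under these assumptions, a single negative value collapses to the range `(0, 0)`, and anything other than one or two values is rejected.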
The suggested way to run the application is to provide the arguments in a JSON file. An example template is provided in the root directory. For input files, the best practice is to use absolute paths.
{
"reads_bam": "<file.bam>",
"reads_tsv": "<file.tsv>",
"reference_gtf": "<file.gtf>",
"reference_fasta": "<file.fa>",
"gffcompare_gtf": "<file.gtf>",
"isoquant_gtf": "<file.gtf>",
"offset": [0, 6],
"class_code": "j c k s x",
"window_size": 8,
"min_reads_for_graph": 100,
"force": false,
"extended_debugging": false,
"no_canonicals": false,
"very_conservative": false
}
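
Since absolute paths are recommended for input files, it may be useful to check an arguments file before running compAna. The sketch below is not part of the project: the helper names `load_arguments` and `relative_paths` are hypothetical, and the list of file keys is taken from the template above. It assumes POSIX-style paths.

```python
import json
import os

# Keys from the template above that refer to input files.
FILE_KEYS = (
    "reads_bam",
    "reads_tsv",
    "reference_gtf",
    "reference_fasta",
    "gffcompare_gtf",
    "isoquant_gtf",
)


def load_arguments(path: str) -> dict:
    """Read the JSON arguments file into a dictionary."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def relative_paths(config: dict) -> list:
    """Return the file keys whose values are not absolute paths."""
    return [
        key
        for key in FILE_KEYS
        if key in config and not os.path.isabs(config[key])
    ]
```

For example, `relative_paths(load_arguments("arguments.json"))` would list every input-file key that still holds a relative path; an empty list means all provided paths are absolute.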
Examples
The following command reads the arguments from a given JSON file:
python3 src/compana.py -j <arguments.json>
The following command passes the minimum set of arguments needed to run compAna directly on the command line:
python3 src/compana.py \
--gffcompare_gtf=gffcompare-file.gtf \
--reference_gtf=ref-file.gtf \
--reference_fasta=file.fa \
--reads_bam=file.bam \
--reads_tsv=file.tsv \
--offset="0 6"