0.8.1 release
sigven committed May 22, 2019
1 parent a34c4aa commit ac0c8d2
Showing 43 changed files with 313 additions and 136 deletions.
17 changes: 10 additions & 7 deletions README.md
@@ -7,6 +7,9 @@ The Personal Cancer Genome Reporter (PCGR) is a stand-alone software package for
![PCGR overview](PCGR_workflow.png)

### News
* _May 22nd 2019_: **0.8.1 release**
* Added *Cancer_NOS.toml* for unspecified tumor types
* Minor bugfixing
* _May 20th 2019_: **0.8.0 release**
* Bundle update (VEP, CIViC, UniProt, CancerMine, dbNSFP, OpenTargets, DisGeNET, TCGA, ICGC-PCAWG)
* New functionality
@@ -122,7 +125,7 @@ c. Pull the [PCGR Docker image (*dev*)](https://hub.docker.com/r/sigven/pcgr/) f

##### Latest release

a. Download and unpack the [latest software release (0.8.0)](https://github.com/sigven/pcgr/releases/tag/v0.8.0)
a. Download and unpack the [latest software release (0.8.1)](https://github.com/sigven/pcgr/releases/tag/v0.8.1)

b. Download and unpack the assembly-specific data bundle in the PCGR directory
* [grch37 data bundle - 20190519](https://drive.google.com/open?id=1vIESS8NxiITUnrqZoWOdNk1YsklH8f1C) (approx 15Gb)
@@ -131,8 +134,8 @@ b. Download and unpack the assembly-specific data bundle in the PCGR directory

A _data/_ folder within the _pcgr-X.X_ software folder should now have been produced

c. Pull the [PCGR Docker image (0.8.0)](https://hub.docker.com/r/sigven/pcgr/) from DockerHub (approx 5.2Gb):
* `docker pull sigven/pcgr:0.8.0` (PCGR annotation engine)
c. Pull the [PCGR Docker image (0.8.1)](https://hub.docker.com/r/sigven/pcgr/) from DockerHub (approx 5.2Gb):
* `docker pull sigven/pcgr:0.8.1` (PCGR annotation engine)

#### STEP 3: Input preprocessing

@@ -189,7 +192,7 @@ A tumor sample report is generated by calling the Python script __pcgr.py__, whi

positional arguments:
pcgr_dir PCGR base directory with accompanying data directory,
e.g. ~/pcgr-0.8.0
e.g. ~/pcgr-0.8.1
output_dir Output directory
{grch37,grch38} Genome assembly build: grch37 or grch38
configuration_file PCGR configuration file (TOML format, in conf/ folder)
@@ -236,9 +239,9 @@ A tumor sample report is generated by calling the Python script __pcgr.py__, whi

The _examples_ folder contains input files from two tumor samples sequenced within TCGA (**GRCh37** only). It also contains PCGR configuration files customized for these cases. A report for a colorectal tumor case can be generated by running the following command in your terminal window:

`python pcgr.py --input_vcf ~/pcgr-0.8.0/examples/tumor_sample.COAD.vcf.gz`
`--input_cna ~/pcgr-0.8.0/examples/tumor_sample.COAD.cna.tsv --tumor_purity 0.9 --tumor_ploidy 2.0`
` ~/pcgr-0.8.0 ~/pcgr-0.8.0/examples grch37 ~/pcgr-0.8.0/examples/examples_COAD.toml tumor_sample.COAD`
`python pcgr.py --input_vcf ~/pcgr-0.8.1/examples/tumor_sample.COAD.vcf.gz`
`--input_cna ~/pcgr-0.8.1/examples/tumor_sample.COAD.cna.tsv --tumor_purity 0.9 --tumor_ploidy 2.0`
` ~/pcgr-0.8.1 ~/pcgr-0.8.1/examples grch37 ~/pcgr-0.8.1/examples/examples_COAD.toml tumor_sample.COAD`
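
The example command above can also be assembled programmatically, which helps when the same invocation is repeated across samples. The following is a minimal sketch, not part of PCGR: the helper `build_pcgr_command` and its parameter names are hypothetical, and only the flags and the argument order visible in the example command are used.

```python
from pathlib import PurePosixPath


def build_pcgr_command(pcgr_dir, sample_id, config_name,
                       assembly="grch37", tumor_purity=0.9, tumor_ploidy=2.0):
    """Return the example pcgr.py invocation as an argument list.

    Hypothetical helper: it assumes the input files live in the
    'examples' folder and are named after the sample identifier,
    as in the COAD example.
    """
    base = PurePosixPath(pcgr_dir)
    examples = base / "examples"
    return [
        "python", "pcgr.py",
        "--input_vcf", str(examples / f"{sample_id}.vcf.gz"),
        "--input_cna", str(examples / f"{sample_id}.cna.tsv"),
        "--tumor_purity", str(tumor_purity),
        "--tumor_ploidy", str(tumor_ploidy),
        str(base),                    # pcgr_dir
        str(examples),                # output_dir
        assembly,                     # grch37 or grch38
        str(examples / config_name),  # configuration_file
        sample_id,                    # trailing sample identifier
    ]


cmd = build_pcgr_command("~/pcgr-0.8.1", "tumor_sample.COAD", "examples_COAD.toml")
# Joined with spaces for display only.
print(" ".join(cmd))
```

Note that `~` is left unexpanded here; if the list is passed to `subprocess.run` rather than pasted into a shell, the paths would first need `os.path.expanduser`.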


This command will run the Docker-based PCGR workflow and produce the following output files in the _examples_ folder:
128 changes: 128 additions & 0 deletions conf/Cancer_NOS.toml
@@ -0,0 +1,128 @@
# Basic PCGR configuration options (TOML).

[tumor_only]
## If the input VCF contains a mix of germline and somatic variants (called with no matching control, i.e. tumor-only), set vcf_tumor_only to true
vcf_tumor_only = false

## If vcf_tumor_only = true, several filters can be configured, all as a means to minimize the proportion of germline calls in the raw set derived from tumor-only calling

## Exclude variants (SNVs/InDels) with minor allele frequency above the following population-specific thresholds
## 1000 Genomes Project - WGS data
maf_onekg_eur = 0.002
maf_onekg_amr = 0.002
maf_onekg_afr = 0.002
maf_onekg_sas = 0.002
maf_onekg_eas = 0.002
maf_onekg_global = 0.002

## exclude variants with minor allele frequency above the following population-specific thresholds
## gnomAD - WES data
maf_gnomad_nfe = 0.002
maf_gnomad_amr = 0.002
maf_gnomad_afr = 0.002
maf_gnomad_asj = 0.002
maf_gnomad_sas = 0.002
maf_gnomad_eas = 0.002
maf_gnomad_fin = 0.002
maf_gnomad_oth = 0.002
maf_gnomad_global = 0.002

## Exclude variants occurring in PoN (panel of normals, if provided as VCF)
exclude_pon = true

## Exclude likely homozygous germline variants (100% allelic fraction for alternate allele in tumor, very unlikely somatic event)
exclude_likely_hom_germline = false

## Exclude likely heterozygous germline variants
## Must satisfy i) 40-60 % allelic fraction for alternate allele in tumor sample, ii) present in dbSNP + gnomAD, iii) not existing as somatic event in COSMIC/TCGA
## Note that the application of this filter may be suboptimal for very impure tumors or variants affected by CNAs etc. (under these circumstances, the allelic fraction
## will be skewed; see e.g. discussion in PMID:29249243)
exclude_likely_het_germline = false

## Exclude variants found in dbSNP (only those that are NOT found in ClinVar(somatic origin)/DoCM/TCGA/COSMIC)
exclude_dbsnp_nonsomatic = false

## exclude all non-exonic variants
exclude_nonexonic = true

[allelic_support]
## Specify INFO tags in input VCF that denote depth/allelic fraction in tumor and normal sample
## An additional tag that denotes call confidence (call_conf_tag) can also be specified, which will
## be used for exploration in the global variant browser. Note that 'tumor_dp_tag' must be of
## Type=Integer, and 'tumor_af_tag' must be of Type=Float (similarly for normal sample)
tumor_dp_tag = ""
tumor_af_tag = ""
control_dp_tag = ""
control_af_tag = ""
call_conf_tag = ""

## Set thresholds for tumor depth/allelic fraction; these are applied before report generation
## and only take effect if 'tumor_dp_tag' and 'tumor_af_tag' are specified above (similarly
## for 'control_dp_tag' and 'control_af_tag')
tumor_dp_min = 0
tumor_af_min = 0.0
control_dp_min = 0
control_af_max = 1.0

[mutational_burden]
## Calculate mutational burden (similar to Chalmers et al., Genome Med, 2017)
mutational_burden = true
## Size of coding target region in megabases (defaults to size of protein-coding regions of GENCODE ~ 34 Mb)
## Note: this should ideally denote the callable target size (i.e. reflecting variable sequencing depth)
target_size_mb = 34.0
## set upper limits to tumor mutational burden tertiles (mutations/Mb)
tmb_low_limit = 5
tmb_intermediate_limit = 20
## tmb_high = tmb > tmb_intermediate_limit

[cna]
## log ratio thresholds for determination of copy number gains and homozygous deletions
logR_gain = 0.8
logR_homdel = -0.8

## mean percent overlap between copy number segment and gene transcripts for reporting of gains/losses in tumor suppressor genes/oncogenes
cna_overlap_pct = 50

[msi]
## Predict microsatellite instability
msi = true

[mutational_signatures]
## Identify relative contribution of 30 known mutational signatures (COSMIC) through the deconstructSigs framework
mutsignatures = true
## deconstructSigs option: number of mutational signatures to limit the search to ('signatures.limit' in whichSignatures)
mutsignatures_signature_limit = 6
## deconstructSigs option: type of trimer count normalization for inference of known mutational signatures, see explanation at https://github.com/raerose01/deconstructSigs
## options = 'default', 'exome', 'genome', 'exome2genome'
## NOTE: If your data (VCF) is from exome sequencing, 'default' or 'exome2genome' should be used. See https://github.com/raerose01/deconstructSigs/issues/2
mutsignatures_normalization = "exome2genome"
## Require a minimum number of mutations for signature estimation
mutsignatures_mutation_limit = 100
## deconstructSigs option: discard any signature contributions with a weight less than this amount
mutsignatures_cutoff = 0.06

[visual]
## Choose visual theme of report, any of: "default", "cerulean", "journal", "flatly", "readable", "spacelab", "united", "cosmo", "lumen", "paper", "sandstone", "simplex", or "yeti" (https://bootswatch.com/)
report_theme = "default"

[custom_tags]
## list VCF info tags that should be present in JSON and TSV output
## tags should be comma separated, i.e. custom_tags = "MUTECT2_FILTER,STRELKA_FILTER"
custom_tags = ""

[other]
## list/do not list noncoding variants
list_noncoding = true
## VEP/vcfanno processing options
n_vcfanno_proc = 4
n_vep_forks = 4
## Customise the order of criteria used to pick the primary transcript in VEP (see https://www.ensembl.org/info/docs/tools/vep/script/vep_options.html#opt_pick_order)
vep_pick_order = "canonical,appris,biotype,ccds,rank,tsl,length"
## omit intergenic variants during VEP processing
vep_skip_intergenic = false
## generate a MAF for input VCF using https://github.com/mskcc/vcf2maf
vcf2maf = true

## Not for edit
[tumor_type]
type = ""
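
The `[mutational_burden]` settings in *Cancer_NOS.toml* above can be illustrated with a short sketch. This is not PCGR's implementation: the function `classify_tmb` is hypothetical, it assumes the caller has already counted the mutations that contribute to the burden, and it assumes both limits are inclusive upper bounds (the config only states that high means `tmb > tmb_intermediate_limit`).

```python
def classify_tmb(n_mutations, target_size_mb=34.0,
                 tmb_low_limit=5.0, tmb_intermediate_limit=20.0):
    """Return (tmb, level) for a mutation count and target size.

    Defaults mirror the values in Cancer_NOS.toml. Treating the
    limits as inclusive upper bounds is an assumption of this sketch.
    """
    if target_size_mb <= 0:
        raise ValueError("target_size_mb must be positive")
    tmb = n_mutations / target_size_mb  # mutations per megabase
    if tmb <= tmb_low_limit:
        level = "low"
    elif tmb <= tmb_intermediate_limit:
        level = "intermediate"
    else:
        level = "high"  # tmb > tmb_intermediate_limit
    return tmb, level


# 340 mutations over the default 34 Mb target give 10 mutations/Mb.
tmb, level = classify_tmb(340)
print(f"{tmb:.1f} mutations/Mb ({level})")
```

Because the burden is a simple ratio, the result is only as good as `target_size_mb`; as the config comment notes, this should ideally reflect the callable target size of the assay.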
5 changes: 5 additions & 0 deletions docs/CHANGELOG.md
@@ -1,6 +1,11 @@

## CHANGELOG

#### 0.8.1 - May 22nd 2019

##### Added
* *Cancer_NOS.toml* as configuration file for unspecified tumor types

#### 0.8.0 - May 20th 2019

##### Fixed
16 changes: 13 additions & 3 deletions docs/CHANGELOG.rst
@@ -1,6 +1,14 @@
CHANGELOG
---------

0.8.1 - May 22nd 2019
^^^^^^^^^^^^^^^^^^^^^

Added
'''''

- *Cancer_NOS.toml* as configuration file for unspecified tumor types

0.8.0 - May 20th 2019
^^^^^^^^^^^^^^^^^^^^^

@@ -10,6 +18,8 @@ Fixed
- Bug in value box for Tier 2 variants (new line carriage) `Issue
#73 <https://github.com/sigven/pcgr/issues/73>`__

.. _added-1:

Added
'''''

@@ -183,7 +193,7 @@ Fixed
- Removed ‘COSM’ prefix in COSMIC mutation links
- Bug in retrieval of splice site predictions from dbscSNV

.. _added-1:
.. _added-2:

Added
'''''
@@ -268,7 +278,7 @@ Fixed
- Bug in copy number annotation (missing protein-coding transcripts)
- Updated MSI prediction (variable importance, performance measures)

.. _added-2:
.. _added-3:

Added
'''''
@@ -300,7 +310,7 @@ Fixed
0.6.0 - April 25th 2018
^^^^^^^^^^^^^^^^^^^^^^^

.. _added-3:
.. _added-4:

Added
'''''
Binary file modified docs/_build/doctrees/CHANGELOG.doctree
Binary file not shown.
Binary file modified docs/_build/doctrees/about.doctree
Binary file not shown.
Binary file modified docs/_build/doctrees/environment.pickle
Binary file not shown.
Binary file modified docs/_build/doctrees/getting_started.doctree
Binary file not shown.
2 changes: 1 addition & 1 deletion docs/_build/html/.buildinfo
@@ -1,4 +1,4 @@
# Sphinx build info version 1
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
config: 1b5753ec6635113bf8ca74c36d47b914
config: 21394eecc26621110784fc2eac7a29dc
tags: 645f666f9bcd5a90fca523b33c5a78b7