Technology migration for better restartability

Pre-release
@cyenyxe cyenyxe released this 17 Oct 12:23
· 907 commits to master since this release

Version 2.0 of the EVA pipeline will move from Luigi to Spring Batch. Instead of tracking the progress of each step as a whole (done / not done), Spring Batch splits the work into chunks of configurable size. This way, if a step fails after processing millions of variants, it is resumed from the last completed chunk instead of being restarted from scratch.
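
The chunk-based restart idea can be illustrated with a small, language-neutral sketch. This is not Spring Batch code and not the EVA pipeline's implementation: `process_in_chunks`, `FlakyHandler`, and the dictionary used as a checkpoint are hypothetical names chosen for the illustration, and plain integers stand in for variants. It assumes progress is recorded only after a whole chunk succeeds, so a failure discards the partial chunk and a rerun picks up at the start of that chunk.

```python
def process_in_chunks(items, chunk_size, handle, checkpoint):
    """Process items chunk by chunk, recording progress after each chunk.

    `checkpoint` is a plain dict standing in for a persistent job store
    (an assumption of this sketch). Progress is committed only when every
    item in a chunk has been handled, so a failed chunk leaves no trace.
    """
    start = checkpoint.get("next_index", 0)
    for begin in range(start, len(items), chunk_size):
        chunk = items[begin:begin + chunk_size]
        # If handle() raises here, nothing from this chunk is committed.
        results = [handle(item) for item in chunk]
        checkpoint.setdefault("written", []).extend(results)
        checkpoint["next_index"] = begin + len(chunk)
    return checkpoint


class FlakyHandler:
    """Hypothetical item handler that fails once on a chosen item."""

    def __init__(self, fail_on):
        self.fail_on = fail_on
        self.failed_once = False
        self.calls = []  # every item the handler was asked to process

    def __call__(self, item):
        self.calls.append(item)
        if item == self.fail_on and not self.failed_once:
            self.failed_once = True
            raise RuntimeError(f"simulated failure on item {item}")
        return item * 2


items = list(range(10))          # stand-ins for variants
checkpoint = {}                  # stand-in for a persistent job store
handler = FlakyHandler(fail_on=7)

# First run: chunks [0,1,2] and [3,4,5] commit, chunk [6,7,8] fails on 7.
try:
    process_in_chunks(items, 3, handler, checkpoint)
except RuntimeError:
    pass
index_after_failure = checkpoint["next_index"]

# Second run: resumes at index 6; items 0-5 are not handled again.
process_in_chunks(items, 3, handler, checkpoint)
```

In this sketch a smaller chunk size means less repeated work after a failure, at the cost of more frequent checkpoint writes, which is the trade-off a configurable chunk size exposes.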

The functionality implemented for this first beta includes:

  • Normalization of variants reported in a VCF file
  • Storage of variants in MongoDB
  • Calculation of allele frequencies and other statistics for all the samples in a VCF file
  • Annotation using Ensembl Variant Effect Predictor

Future beta releases will include support for population statistics via a PED file and improved usability.