DeepConsensus 1.0.0
- DeepConsensus v1.0 introduces a new model that greatly improves the empirical Q30 yield across the chemistries and insert sizes we tested. For example, using our chem2.2_24kb dataset we observe an increase in Q30 yield from 149% to 176%.
- We reduced the size of our model (using distillation) and the size of the model inputs to lower runtime by approximately 10%, while still improving accuracy over v0.3.
- DeepConsensus can now output a BAM file. BAM output can be used to examine the effective coverage (ec), number of passes (np), or predicted average read accuracy (rq).
- v1.0 introduces a training tutorial that serves as a proof of concept for developing a training setup.
- Models introduced previously (v0.1, v0.2, v0.3) are not compatible with v1.0 and vice versa.
- `--max_passes` and `--example_width` are now defined by the model `params.json` file. Users do not need to set these flags when running inference.
- The `--padding` flag has been removed. Padding is no longer added to model inputs.
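As a rough illustration of how the per-read BAM tags mentioned above (ec, np, rq) might be inspected, here is a minimal sketch. It assumes the BAM has first been converted to SAM text (for example with `samtools view`) and that the tags follow the standard SAM optional-field syntax `TAG:TYPE:VALUE`. The helper names `parse_read_tags` and `predicted_phred` and the example record are hypothetical and are not part of DeepConsensus. The Phred conversion reflects the predicted accuracy in rq, not the empirical Q30 yield reported in these notes.

```python
import math

# Tags named in the release notes; parsed from SAM optional fields (TAG:TYPE:VALUE).
TAGS_OF_INTEREST = ("ec", "np", "rq")


def parse_read_tags(sam_line):
    """Return (read_name, {tag: value}) for one SAM alignment line (hypothetical helper)."""
    fields = sam_line.rstrip("\n").split("\t")
    read_name = fields[0]
    tags = {}
    # The first 11 columns are mandatory; optional fields follow.
    for field in fields[11:]:
        tag, value_type, value = field.split(":", 2)
        if tag not in TAGS_OF_INTEREST:
            continue
        if value_type == "i":
            tags[tag] = int(value)
        elif value_type == "f":
            tags[tag] = float(value)
        else:
            tags[tag] = value
    return read_name, tags


def predicted_phred(rq, cap=60.0):
    """Convert a predicted accuracy in [0, 1] to a Phred-scaled quality, capped at `cap`."""
    if rq >= 1.0:
        return cap
    return min(cap, -10.0 * math.log10(1.0 - rq))


# Made-up record for illustration only (not real DeepConsensus output).
example_line = "\t".join([
    "movie/1/ccs", "4", "*", "0", "255", "*", "*", "0", "0",
    "ACGT", "IIII", "ec:f:12.5", "np:i:10", "rq:f:0.999",
])
name, tags = parse_read_tags(example_line)
print(name, tags, round(predicted_phred(tags["rq"]), 1))
```

Only alignment lines should be passed to the parser; SAM header lines (those starting with `@`) need to be skipped first.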
Acknowledgements
- Thanks to Armin Töpfer (@armintoepfer), Aaron Wenger (@amwenger), and William Rowell (@williamrowell) at PacBio for advice and collaboration.
- Thanks to Lucas Brambrink (@lucasbrambrink) for model experiments and analysis.
- Thanks to Daniel Liu (@Daniel-Liu-c0deb0t) for model experiments, analysis, and advice.