PLoS ONE (Jan 2021)

Multithreaded variant calling in elPrep 5.

  • Charlotte Herzeel,
  • Pascal Costanza,
  • Dries Decap,
  • Jan Fostier,
  • Roel Wuyts,
  • Wilfried Verachtert

DOI
https://doi.org/10.1371/journal.pone.0244471
Journal volume & issue
Vol. 16, no. 2
p. e0244471

Abstract


We present elPrep 5, which adds variant calling to the elPrep framework for processing sequence alignment/map files. elPrep 5 can now execute the full pipeline described by the GATK Best Practices for variant calling, which consists of PCR and optical duplicate marking, sorting by coordinate order, base quality score recalibration, and variant calling using the haplotype caller algorithm. elPrep 5 produces BAM and VCF output identical to that of GATK4 while significantly reducing the runtime by parallelizing and merging the execution of the pipeline steps. Our benchmarks show that elPrep 5 reduces the runtime of the variant calling pipeline by a factor of 8 to 16 on both whole-exome and whole-genome data while using the same hardware resources as GATK4. This makes elPrep 5 a suitable drop-in replacement for GATK4 when faster execution times are needed.
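To make the reported speedup concrete, the short sketch below converts a baseline runtime into the range implied by a factor of 8 to 16. The function name `projected_runtime` and the 24-hour baseline are hypothetical and chosen only for illustration; they are not figures from the paper's benchmarks.

```python
from typing import Tuple


def projected_runtime(
    baseline_hours: float,
    min_speedup: float = 8.0,
    max_speedup: float = 16.0,
) -> Tuple[float, float]:
    """Return (fastest, slowest) projected runtime in hours.

    The default speedup factors are the 8 to 16 range stated in the
    abstract; baseline_hours is whatever the reference pipeline took.
    """
    if baseline_hours < 0:
        raise ValueError("baseline_hours must be non-negative")
    if min_speedup <= 0 or max_speedup < min_speedup:
        raise ValueError("speedup factors must satisfy 0 < min <= max")
    # The largest speedup gives the shortest runtime, and vice versa.
    return baseline_hours / max_speedup, baseline_hours / min_speedup


if __name__ == "__main__":
    # Hypothetical baseline: a pipeline run that takes 24 hours.
    fastest, slowest = projected_runtime(24.0)
    print(f"Projected runtime: {fastest:.1f} to {slowest:.1f} hours")
```

Under these assumptions, a run that takes a full day on the baseline would finish in roughly one and a half to three hours on the same hardware.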