BMC Genomics (Aug 2012)

Improving mapping and SNP-calling performance in multiplexed targeted next-generation sequencing

  • ElSharawy Abdou,
  • Forster Michael,
  • Schracke Nadine,
  • Keller Andreas,
  • Thomsen Ingo,
  • Petersen Britt-Sabina,
  • Stade Björn,
  • Stähler Peer,
  • Schreiber Stefan,
  • Rosenstiel Philip,
  • Franke Andre

DOI
https://doi.org/10.1186/1471-2164-13-417
Journal volume & issue
Vol. 13, no. 1
p. 417

Abstract

Background: Compared to classical genotyping, targeted next-generation sequencing (tNGS) can be custom-designed to interrogate entire genomic regions of interest, in order to detect novel as well as known variants. To bring down the per-sample cost, one approach is to pool barcoded NGS libraries before sample enrichment. Still, we lack a complete understanding of how this multiplexed tNGS approach and the varying performance of the ever-evolving analytical tools can affect the quality of variant discovery. Therefore, we evaluated the impact of different software tools and analytical approaches on the discovery of single nucleotide polymorphisms (SNPs) in multiplexed tNGS data. To generate our own test model, we combined a sequence capture method with NGS in three experimental stages of increasing complexity (E. coli genes, multiplexed E. coli, and multiplexed HapMap BRCA1/2 regions).

Results: We successfully enriched barcoded NGS libraries instead of genomic DNA, achieving reproducible coverage profiles (Pearson correlation coefficients of up to 0.99) across multiplexed samples, with [...]

Conclusions: We recommend applying our general ‘two-step’ mapping approach for more efficient SNP discovery in tNGS. Our study has also shown the benefit of computing inter-sample SNP-concordances and inspecting read alignments in order to attain more confident results.

Keywords