Genome Biology (Dec 2019)

Paragraph: a graph-based structural variant genotyper for short-read sequence data

  • Sai Chen,
  • Peter Krusche,
  • Egor Dolzhenko,
  • Rachel M. Sherman,
  • Roman Petrovski,
  • Felix Schlesinger,
  • Melanie Kirsche,
  • David R. Bentley,
  • Michael C. Schatz,
  • Fritz J. Sedlazeck,
  • Michael A. Eberle

DOI
https://doi.org/10.1186/s13059-019-1909-7
Journal volume & issue
Vol. 20, no. 1
pp. 1 – 13

Abstract

Accurate detection and genotyping of structural variations (SVs) from short-read data is a long-standing area of development in genomics research and clinical sequencing pipelines. We introduce Paragraph, an accurate genotyper that models SVs using sequence graphs and SV annotations. We demonstrate the accuracy of Paragraph on whole-genome sequence data from three samples using long-read SV calls as the truth set, and then apply Paragraph at scale to a cohort of 100 short-read sequenced samples of diverse ancestry. Our analysis shows that Paragraph has better accuracy than other existing genotypers and can be applied to population-scale studies.

Keywords