Genome Biology (Apr 2020)

ExpansionHunter Denovo: a computational method for locating known and novel repeat expansions in short-read sequencing data

  • Egor Dolzhenko,
  • Mark F. Bennett,
  • Phillip A. Richmond,
  • Brett Trost,
  • Sai Chen,
  • Joke J. F. A. van Vugt,
  • Charlotte Nguyen,
  • Giuseppe Narzisi,
  • Vladimir G. Gainullin,
  • Andrew M. Gross,
  • Bryan R. Lajoie,
  • Ryan J. Taft,
  • Wyeth W. Wasserman,
  • Stephen W. Scherer,
  • Jan H. Veldink,
  • David R. Bentley,
  • Ryan K. C. Yuen,
  • Melanie Bahlo,
  • Michael A. Eberle

DOI
https://doi.org/10.1186/s13059-020-02017-z
Journal volume & issue
Vol. 21, no. 1
pp. 1 – 14

Abstract

Repeat expansions are responsible for over 40 monogenic disorders, and undoubtedly more pathogenic repeat expansions remain to be discovered. Existing methods for detecting repeat expansions in short-read sequencing data require predefined repeat catalogs. Recent discoveries emphasize the need for methods that do not require pre-specified candidate repeats. To address this need, we introduce ExpansionHunter Denovo, an efficient catalog-free method for genome-wide repeat expansion detection. Analysis of real and simulated data shows that our method can identify large expansions of 41 out of 44 pathogenic repeats, including nine recently reported non-reference repeat expansions not discoverable via existing methods.

Keywords