Genome Biology (Oct 2021)

SquiggleNet: real-time, direct classification of nanopore signals

  • Yuwei Bao,
  • Jack Wadden,
  • John R. Erb-Downward,
  • Piyush Ranjan,
  • Weichen Zhou,
  • Torrin L. McDonald,
  • Ryan E. Mills,
  • Alan P. Boyle,
  • Robert P. Dickson,
  • David Blaauw,
  • Joshua D. Welch

DOI
https://doi.org/10.1186/s13059-021-02511-y
Journal volume & issue
Vol. 22, no. 1
pp. 1 – 16

Abstract

We present SquiggleNet, the first deep-learning model that can classify nanopore reads directly from their electrical signals. SquiggleNet operates faster than DNA passes through the pore, allowing real-time classification and read ejection. Using 1 s of sequencing data, the classifier achieves significantly higher accuracy than base calling followed by sequence alignment. Our approach is also faster and requires an order of magnitude less memory than alignment-based approaches. SquiggleNet distinguished human from bacterial DNA with over 90% accuracy, generalized to unseen bacterial species in a human respiratory metagenome sample, and accurately classified sequences containing human long interspersed repeat elements.

Keywords