BMC Bioinformatics (Dec 2021)

BreakNet: detecting deletions using long reads and a deep learning approach

  • Junwei Luo,
  • Hongyu Ding,
  • Jiquan Shen,
  • Haixia Zhai,
  • Zhengjiang Wu,
  • Chaokun Yan,
  • Huimin Luo

DOI
https://doi.org/10.1186/s12859-021-04499-5
Journal volume & issue
Vol. 22, no. 1
pp. 1 – 13

Abstract

Read online

Abstract Background Structural variations (SVs) occupy a prominent position in human genetic diversity, and deletions form an important type of SV that has been suggested to be associated with genetic diseases. Although various deletion calling methods based on long reads have been proposed, a new approach is still needed to mine features in long-read alignment information. Recently, deep learning has attracted much attention in genome analysis, and it is a promising technique for calling SVs. Results In this paper, we propose BreakNet, a deep learning method that detects deletions by using long reads. BreakNet first extracts feature matrices from long-read alignments. Second, it uses a time-distributed convolutional neural network (CNN) to integrate and map the feature matrices to feature vectors. Third, BreakNet employs a bidirectional long short-term memory (BLSTM) model to analyse the produced set of continuous feature vectors in both the forward and backward directions. Finally, a classification module determines whether a region refers to a deletion. On real long-read sequencing datasets, we demonstrate that BreakNet outperforms Sniffles, SVIM and cuteSV in terms of their F1 scores. The source code for the proposed method is available from GitHub at https://github.com/luojunwei/BreakNet . Conclusions Our work shows that deep learning can be combined with long reads to call deletions more effectively than existing methods.