BMC Bioinformatics (Sep 2024)

FindCSV: a long-read based method for detecting complex structural variations

  • Yan Zheng,
  • Xuequn Shang

DOI
https://doi.org/10.1186/s12859-024-05937-w
Journal volume & issue
Vol. 25, no. 1
pp. 1 – 19

Abstract

Read online

Abstract Background Structural variations play a significant role in genetic diseases and evolutionary mechanisms. Extensive research has been conducted over the past decade to detect simple structural variations, leading to the development of well-established detection methods. However, recent studies have highlighted the potentially greater impact of complex structural variations on individuals compared to simple structural variations. Despite this, the field still lacks precise detection methods specifically designed for complex structural variations. Therefore, the development of a highly efficient and accurate detection method is of utmost importance. Result In response to this need, we propose a novel method called FindCSV, which leverages deep learning techniques and consensus sequences to enhance the detection of SVs using long-read sequencing data. Compared to current methods, FindCSV performs better in detecting complex and simple structural variations. Conclusions FindCSV is a new method to detect complex and simple structural variations with reasonable accuracy in real and simulated data. The source code for the program is available at https://github.com/nwpuzhengyan/FindCSV .

Keywords