Genome Biology (Jul 2024)

Comprehensive and deep evaluation of structural variation detection pipelines with third-generation sequencing data

  • Zhi Liu,
  • Zhi Xie,
  • Miaoxin Li

DOI
https://doi.org/10.1186/s13059-024-03324-5
Journal volume & issue
Vol. 25, no. 1
pp. 1 – 25

Abstract

Read online

Abstract Background Structural variation (SV) detection methods using third-generation sequencing data are widely employed, yet accurately detecting SVs remains challenging. Different methods often yield inconsistent results for certain SV types, complicating tool selection and revealing biases in detection. Results This study comprehensively evaluates 53 SV detection pipelines using simulated and real data from PacBio (CLR: Continuous Long Read, CCS: Circular Consensus Sequencing) and Nanopore (ONT) platforms. We assess their performance in detecting various sizes and types of SVs, breakpoint biases, and genotyping accuracy with various sequencing depths. Notably, pipelines such as Minimap2-cuteSV2, NGMLR-SVIM, PBMM2-pbsv, Winnowmap-Sniffles2, and Winnowmap-SVision exhibit comparatively higher recall and precision. Our findings also show that combining multiple pipelines with the same aligner, like pbmm2 or winnowmap, can significantly enhance performance. The individual pipelines’ detailed ranking and performance metrics can be viewed in a dynamic table: http://pmglab.top/SVPipelinesRanking . Conclusions This study comprehensively characterizes the strengths and weaknesses of numerous pipelines, providing valuable insights that can improve SV detection in third-generation sequencing data and inform SV annotation and function prediction.

Keywords