BMC Bioinformatics (Apr 2023)

Validation of genetic variants from NGS data using deep convolutional neural networks

  • Marc Vaisband,
  • Maria Schubert,
  • Franz Josef Gassner,
  • Roland Geisberger,
  • Richard Greil,
  • Nadja Zaborsky,
  • Jan Hasenauer

DOI
https://doi.org/10.1186/s12859-023-05255-7
Journal volume & issue
Vol. 24, no. 1
pp. 1 – 19

Abstract

Accurate somatic variant calling from next-generation sequencing data is one of the most important tasks in personalised cancer therapy. The sophistication of the available technologies is ever-increasing, yet manual candidate refinement is still a necessary step in state-of-the-art processing pipelines. This limits reproducibility and introduces a bottleneck with respect to scalability. We demonstrate that the validation of genetic variants can be improved using a machine learning approach based on a convolutional neural network trained on existing human annotation. In contrast to existing approaches, we introduce a way in which contextual data from sequencing tracks can be included in the automated assessment. A rigorous evaluation shows that the resulting model is robust and performs on par with trained researchers following a published standard operating procedure.

Keywords