Scientific Reports (May 2022)

Using generative adversarial networks for genome variant calling from low depth ONT sequencing data

  • Han Yang,
  • Fei Gu,
  • Lei Zhang,
  • Xian-Sheng Hua

DOI
https://doi.org/10.1038/s41598-022-12346-7
Journal volume & issue
Vol. 12, no. 1
pp. 1 – 9

Abstract

Read online

Abstract Genome variant calling is a challenging yet critical task for subsequent studies. Existing methods almost rely on high depth DNA sequencing data. Performance on low depth data drops a lot. Using public Oxford Nanopore (ONT) data of human being from the Genome in a Bottle (GIAB) Consortium, we trained a generative adversarial network for low depth variant calling. Our method, noted as LDV-Caller, can project high depth sequencing information from low depth data. It achieves 94.25% F1 score on low depth data, while the F1 score of the state-of-the-art method on two times higher depth data is 94.49%. By doing so, the price of genome-wide sequencing examination can reduce deeply. In addition, we validated the trained LDV-Caller model on 157 public Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) samples. The mean sequencing depth of these samples is 2982. The LDV-Caller yields 92.77% F1 score using only 22x sequencing depth, which demonstrates our method has potential to analyze different species with only low depth sequencing data.