Genome Biology (Apr 2023)

Transformation of alignment files improves performance of variant callers for long-read RNA sequencing data

  • Vladimir B. C. de Souza,
  • Ben T. Jordan,
  • Elizabeth Tseng,
  • Elizabeth A. Nelson,
  • Karen K. Hirschi,
  • Gloria Sheynkman,
  • Mark D. Robinson

DOI
https://doi.org/10.1186/s13059-023-02923-y
Journal volume & issue
Vol. 24, no. 1
pp. 1 – 14

Abstract

Long-read RNA sequencing (lrRNA-seq) produces detailed information about full-length transcripts, including novel and sample-specific isoforms. Furthermore, there is an opportunity to call variants directly from lrRNA-seq data. However, most state-of-the-art variant callers have been developed for genomic DNA. Here, we pursue two objectives: first, we perform a mini-benchmark of GATK, DeepVariant, Clair3, and NanoCaller, primarily on PacBio Iso-Seq data but also on Nanopore and Illumina RNA-seq data; second, we propose a pipeline to process spliced-alignment files, making them suitable for variant calling with DNA-based callers. With such manipulations, high calling performance can be achieved using DeepVariant on Iso-Seq data.
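
The abstract does not spell out how spliced alignments are transformed, so the following is only an illustrative sketch of one common manipulation, not the authors' pipeline: splitting a spliced read at its intron gaps (the `N` operations in a SAM CIGAR string) so that each exon segment looks like a contiguous DNA-style alignment. The function name `split_at_introns` and the example coordinates are hypothetical, and details such as soft clips, base qualities, and flags are deliberately ignored.

```python
import re
from typing import List, Tuple

# One CIGAR operation: a length followed by an operation code (SAM specification).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

# Operations that advance the position on the reference.
REF_CONSUMING = set("MDN=X")


def split_at_introns(pos: int, cigar: str) -> List[Tuple[int, str]]:
    """Split a spliced alignment into exon segments (hypothetical helper).

    pos   -- reference start of the alignment
    cigar -- SAM CIGAR string, possibly containing N (skipped region) operations

    Returns a list of (segment_start, segment_cigar) pairs, one per exon block.
    """
    segments: List[Tuple[int, str]] = []
    seg_start = pos   # reference start of the segment being built
    ref_pos = pos     # current reference position
    ops: List[str] = []

    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op == "N":
            # An intron gap closes the current segment and starts a new one.
            if ops:
                segments.append((seg_start, "".join(ops)))
            ops = []
            ref_pos += n
            seg_start = ref_pos
        else:
            ops.append(f"{n}{op}")
            if op in REF_CONSUMING:
                ref_pos += n

    if ops:
        segments.append((seg_start, "".join(ops)))
    return segments


if __name__ == "__main__":
    # A read with two exon blocks separated by a 1000-base intron.
    print(split_at_introns(100, "50M1000N30M2D20M"))
```

In this sketch, each returned segment carries no `N` operation, which is the property a DNA-oriented caller would typically expect of its input alignments.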