Genome Biology (Apr 2022)

Predicting RNA splicing from DNA sequence using Pangolin

  • Tony Zeng
  • Yang I Li

DOI
https://doi.org/10.1186/s13059-022-02664-4
Journal volume & issue
Vol. 23, no. 1
pp. 1 – 18

Abstract

Recent progress in deep learning has greatly improved the prediction of RNA splicing from DNA sequence. Here, we present Pangolin, a deep learning model to predict splice site strength in multiple tissues. Pangolin outperforms state-of-the-art methods for predicting RNA splicing on a variety of prediction tasks. Pangolin improves prediction of the impact of genetic variants on RNA splicing, including common, rare, and lineage-specific genetic variation. In addition, Pangolin identifies loss-of-function mutations with high accuracy and recall, particularly for mutations that are not missense or nonsense, demonstrating remarkable potential for identifying pathogenic variants.