Genome Biology (May 2023)

Introme accurately predicts the impact of coding and noncoding variants on gene splicing, with clinical applications

  • Patricia J. Sullivan,
  • Velimir Gayevskiy,
  • Ryan L. Davis,
  • Marie Wong,
  • Chelsea Mayoh,
  • Amali Mallawaarachchi,
  • Yvonne Hort,
  • Mark J. McCabe,
  • Sarah Beecroft,
  • Matilda R. Jackson,
  • Peer Arts,
  • Andrew Dubowsky,
  • Nigel Laing,
  • Marcel E. Dinger,
  • Hamish S. Scott,
  • Emily Oates,
  • Mark Pinese,
  • Mark J. Cowley

DOI
https://doi.org/10.1186/s13059-023-02936-7
Journal volume & issue
Vol. 24, no. 1
pp. 1 – 18

Abstract

Predicting the impact of coding and noncoding variants on splicing is challenging, particularly in non-canonical splice sites, leading to missed diagnoses in patients. Existing splice prediction tools are complementary but knowing which to use for each splicing context remains difficult. Here, we describe Introme, which uses machine learning to integrate predictions from several splice detection tools, additional splicing rules, and gene architecture features to comprehensively evaluate the likelihood of a variant impacting splicing. Through extensive benchmarking across 21,000 splice-altering variants, Introme outperformed all tools (auPRC: 0.98) for the detection of clinically significant splice variants. Introme is available at https://github.com/CCICB/introme.

Keywords