Genome Biology (Mar 2019)

MMSplice: modular modeling improves the predictions of genetic variant effects on splicing

  • Jun Cheng,
  • Thi Yen Duong Nguyen,
  • Kamil J. Cygan,
  • Muhammed Hasan Çelik,
  • William G. Fairbrother,
  • Žiga Avsec,
  • Julien Gagneur

DOI
https://doi.org/10.1186/s13059-019-1653-z
Journal volume & issue
Vol. 20, no. 1
pp. 1 – 15

Abstract

Predicting the effects of genetic variants on splicing is highly relevant for human genetics. We describe the framework MMSplice (modular modeling of splicing) with which we built the winning model of the CAGI5 exon skipping prediction challenge. The MMSplice modules are neural networks scoring exon, intron, and splice sites, trained on distinct large-scale genomics datasets. These modules are combined to predict effects of variants on exon skipping, splice site choice, splicing efficiency, and pathogenicity, with matched or higher performance than state-of-the-art. Our models, available in the repository Kipoi, apply to variants including indels directly from VCF files.

Keywords