Genome Biology (Aug 2023)

Towards in silico CLIP-seq: predicting protein-RNA interaction via sequence-to-signal learning

  • Marc Horlacher,
  • Nils Wagner,
  • Lambert Moyon,
  • Klara Kuret,
  • Nicolas Goedert,
  • Marco Salvatore,
  • Jernej Ule,
  • Julien Gagneur,
  • Ole Winther,
  • Annalisa Marsico

DOI
https://doi.org/10.1186/s13059-023-03015-7
Journal volume & issue
Vol. 24, no. 1
pp. 1 – 37

Abstract

We present RBPNet, a novel deep learning method, which predicts CLIP-seq crosslink count distribution from RNA sequence at single-nucleotide resolution. By training on up to a million regions, RBPNet achieves high generalization on eCLIP, iCLIP and miCLIP assays, outperforming state-of-the-art classifiers. RBPNet performs bias correction by modeling the raw signal as a mixture of the protein-specific and background signal. Through model interrogation via Integrated Gradients, RBPNet identifies predictive sub-sequences that correspond to known and novel binding motifs and enables variant-impact scoring via in silico mutagenesis. Together, RBPNet improves imputation of protein-RNA interactions, as well as mechanistic interpretation of predictions.
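
The mixture idea mentioned in the abstract can be illustrated with a small sketch. This is not the RBPNet implementation: the softmax profiles, the sigmoid-parameterized mixing weight, and the names `softmax`, `mix_profiles` and `mix_logit` are assumptions chosen for illustration. It only shows how a protein-specific profile and a background profile over the same positions could be combined into one per-nucleotide probability distribution.

```python
import math

def softmax(logits):
    """Turn per-position scores into a probability distribution."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mix_profiles(target_logits, background_logits, mix_logit):
    """Combine two per-position profiles into one mixture profile.

    Assumption (hypothetical parameterization): the mixing weight w
    is obtained by squashing a single scalar with a sigmoid.
    """
    w = 1.0 / (1.0 + math.exp(-mix_logit))
    p_target = softmax(target_logits)
    p_background = softmax(background_logits)
    return [w * t + (1.0 - w) * b for t, b in zip(p_target, p_background)]

# Toy example: a peaked protein-specific profile and a flat background.
total_profile = mix_profiles([0.0, 2.0, 0.0], [1.0, 1.0, 1.0], mix_logit=0.0)
```

Because both components are probability distributions and the weights sum to one, the mixture is itself a distribution over positions. In a setup like this, discarding the background component after training would leave the protein-specific profile as the bias-corrected signal.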

Keywords