Frontiers in RNA Research (Oct 2024)

Predicting conserved functional interactions for long noncoding RNAs via deep learning

  • Megan B. Kratz,
  • Keriayn N. Smith

DOI
https://doi.org/10.3389/frnar.2024.1473293
Journal volume & issue
Vol. 2

Abstract

Long noncoding RNA (lncRNA) genes outnumber protein-coding genes in the human genome, and the majority remain uncharacterized. A major difficulty in generalizing our understanding of lncRNA function is the dearth of gross sequence conservation, both for lncRNAs across species and for lncRNAs that perform similar functions within a species. Machine learning-based methods that harness vast amounts of information on RNAs are increasingly used to impute certain biological characteristics. These include interactions with proteins that are important mediators of RNA function, which enables the generation of knowledge in contexts for which experimental data are lacking. Here, we applied a natural language-based machine learning approach that enabled us to identify RNA-binding protein interactions in lncRNA transcripts using only RNA sequence as input. We found that this predictive method is a powerful approach for inferring conserved binding across species as distant as human and opossum, even in the absence of sequence conservation, thus informing sequence-function relationships for these poorly understood RNAs.

Keywords