IEEE Access (Jan 2024)

Unveiling the Potential Pattern Representation of RNA 5-Methyluridine Modification Sites Through a Novel Feature Fusion Model Leveraging Convolutional Neural Network and Tetranucleotide Composition

  • Waleed Alam,
  • Muhammad Tahir,
  • Shahid Hussain,
  • Sarah Gul,
  • Maqsood Hayat,
  • Reyazur Rashid Irshad,
  • Fabiano Pallonetto

DOI
https://doi.org/10.1109/ACCESS.2024.3352823
Journal volume & issue
Vol. 12
pp. 10023 – 10035

Abstract

5-Methyluridine (m5U), a modification found predominantly in RNA and especially enriched in transfer RNA (tRNA), enhances translational accuracy and protein synthesis by supporting precise decoding of genetic information and optimal tRNA function. Identifying m5U modification sites is crucial because the modification has drawn significant attention in contexts such as breast cancer, stress response, and viral infections, where it offers insight into molecular mechanisms and regulatory functions. However, conventional biochemical approaches for identifying m5U modification sites are laborious, involve intricate procedures, rely on sophisticated and expensive instrumentation, and require specialized expertise, so they consume substantial time and resources. A precise and efficient computational method is therefore needed. In this study, we introduce a novel computational approach called “Deep-m5U,” which combines Convolutional Neural Networks (CNNs) with tetranucleotide composition to identify m5U modification sites. Deep-m5U leverages CNNs to accurately detect protein-coding regions and capture relevant motifs, while incorporating tetranucleotide composition to capture global compositional characteristics, resulting in a more robust model with improved performance. We evaluated Deep-m5U on two publicly available benchmark datasets: the full transcript dataset and the mature mRNA dataset. The model achieved accuracies of 91.26% and 95.63%, respectively, surpassing current state-of-the-art methods. The open-source code for Deep-m5U is freely available at: https://github.com/waleed551/Deep-m5U.
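
For readers unfamiliar with tetranucleotide composition, the following is a minimal sketch of how such a feature vector could be computed: the normalized frequency of each of the 4^4 = 256 possible 4-mers in an RNA sequence. It is an illustration only and is not taken from the Deep-m5U repository; the function name `tetranucleotide_composition`, the choice to map T to U, and the decision to skip windows containing ambiguous bases are all assumptions made here.

```python
from itertools import product

# Assumption: the RNA alphabet is A, C, G, U; DNA-style T is mapped to U.
ALPHABET = "ACGU"

# All 256 possible tetranucleotides in a fixed (lexicographic) order.
KMERS = ["".join(p) for p in product(ALPHABET, repeat=4)]
INDEX = {kmer: i for i, kmer in enumerate(KMERS)}


def tetranucleotide_composition(seq):
    """Return a 256-dimensional list of normalized 4-mer frequencies.

    Hypothetical helper for illustration; windows containing characters
    outside A/C/G/U (for example N) are skipped rather than counted.
    """
    seq = seq.upper().replace("T", "U")
    counts = [0] * len(KMERS)
    total = 0
    # Slide a window of width 4 over the sequence, one base at a time.
    for i in range(len(seq) - 3):
        idx = INDEX.get(seq[i:i + 4])
        if idx is not None:
            counts[idx] += 1
            total += 1
    if total == 0:
        # Sequence too short or entirely ambiguous: return a zero vector.
        return [0.0] * len(KMERS)
    return [c / total for c in counts]


if __name__ == "__main__":
    features = tetranucleotide_composition("ACGUACGU")
    print(len(features))            # dimensionality of the feature vector
    print(features[INDEX["ACGU"]])  # relative frequency of the 4-mer ACGU
```

In a feature fusion setting such as the one the abstract describes, a vector like this would typically be combined with the representation learned by the CNN before classification; how Deep-m5U performs that fusion is not specified here.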

Keywords