BMC Biology (Apr 2023)
m5U-SVM: identification of RNA 5-methyluridine modification sites based on multi-view features of physicochemical features and distributed representation
Abstract
Abstract Background RNA 5-methyluridine (m5U) modifications are obtained by methylation at the C5 position of uridine catalyzed by pyrimidine methylation transferase, which is related to the development of human diseases. Accurate identification of m5U modification sites from RNA sequences can contribute to the understanding of their biological functions and the pathogenesis of related diseases. Compared to traditional experimental methods, computational methods developed based on machine learning with ease of use can identify modification sites from RNA sequences in an efficient and time-saving manner. Despite the good performance of these computational methods, there are some drawbacks and limitations. Results In this study, we have developed a novel predictor, m5U-SVM, based on multi-view features and machine learning algorithms to construct predictive models for identifying m5U modification sites from RNA sequences. In this method, we used four traditional physicochemical features and distributed representation features. The optimized multi-view features were obtained from the four fused traditional physicochemical features by using the two-step LightGBM and IFS methods, and then the distributed representation features were fused with the optimized physicochemical features to obtain the new multi-view features. The best performing classifier, support vector machine, was identified by screening different machine learning algorithms. Compared with the results, the performance of the proposed model is better than that of the existing state-of-the-art tool. Conclusions m5U-SVM provides an effective tool that successfully captures sequence-related attributes of modifications and can accurately predict m5U modification sites from RNA sequences. The identification of m5U modification sites helps to understand and delve into the related biological processes and functions.
Keywords