Recognition of RNA-Binding Protein by Fusion of Multi-view and Multi-label Learning

YANG Haitao, DENG Zhaohong, WANG Shitong

doi:10.3778/j.issn.1673-9418.2006096

Jisuanji kexue yu tansuo (Nov 2021)

Affiliations

YANG Haitao, DENG Zhaohong, WANG Shitong: School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, Jiangsu 214122, China

DOI: https://doi.org/10.3778/j.issn.1673-9418.2006096
Journal volume & issue: Vol. 15, no. 11
pp. 2193 – 2205

Abstract

RNA-binding protein (RBP) is the general name for a class of proteins that bind to RNA (ribonucleic acid) and accompany it throughout its regulation and metabolism. An RBP may have multiple target RNAs, and its defective expression may cause various diseases. Most existing methods build a binary classification model for one specific RBP to predict whether a given RNA can bind to it, so they do not take into account the similarity and association between different RBPs. iDeepM addresses this limitation with multi-label deep learning: it combines multi-label techniques with a long short-term memory (LSTM) network, learns the similarity between different RBPs, and predicts the binding of a given RNA to multiple RBPs. However, iDeepM does not perform sufficient feature learning and multi-label learning on RNA sequences, and its prediction accuracy is low. This paper continues the multi-label research direction of iDeepM and proposes a new method, RNA-RBP multi-view learning (RRMVL). For the first time, the RNA sequence view, the amino acid sequence view, the RNA sequence semantic view and the multi-gap dipeptide component view are combined into multi-view data for multi-label RBP recognition. To exploit the different learning advantages of the multi-view data, this paper fuses the deep features extracted from the four views and uses the principle of logistic regression to learn multi-label features from them. The learned weighted feature vectors are then fed to a multi-label classifier chain to achieve the optimal multi-label chain learning effect. Experimental studies show that the RNA-binding protein recognition model combining multi-view and multi-label learning achieves significantly higher prediction accuracy than the previous single-view method.
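As an illustration of one of the four views named in the abstract, the multi-gap dipeptide component view can be sketched as follows. This is a minimal sketch, not the authors' implementation. It assumes the commonly used definition of g-gap dipeptide composition: the normalized frequency of each ordered amino acid pair separated by g residues, concatenated over several gap sizes. The function names, the choice of `max_gap=2` and the handling of non-standard residues are hypothetical. The abstract does not state how the amino acid sequence is derived from the RNA.

```python
from itertools import product

# The 20 standard amino acids; all 400 ordered pairs form the feature axes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
INDEX = {dp: i for i, dp in enumerate(DIPEPTIDES)}


def gap_dipeptide_composition(seq, gap):
    """Return the 400-dim frequency vector of residue pairs separated by `gap` residues."""
    counts = [0] * len(DIPEPTIDES)
    total = 0
    for i in range(len(seq) - gap - 1):
        pair = seq[i] + seq[i + gap + 1]
        idx = INDEX.get(pair)
        if idx is not None:  # assumption: pairs with non-standard residues are skipped
            counts[idx] += 1
            total += 1
    if total == 0:
        # Sequence too short for this gap: return an all-zero vector.
        return [0.0] * len(DIPEPTIDES)
    return [c / total for c in counts]


def multi_gap_dipeptide_features(seq, max_gap=2):
    """Concatenate g-gap compositions for g = 0..max_gap (hypothetical default of 2)."""
    seq = seq.upper()
    features = []
    for gap in range(max_gap + 1):
        features.extend(gap_dipeptide_composition(seq, gap))
    return features
```

With `max_gap=2` each sequence yields a fixed-length vector of 3 × 400 = 1200 values regardless of sequence length, which is what allows such a view to be fused with features from the other views before multi-label learning.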

Published in Jisuanji kexue yu tansuo

ISSN: 1673-9418 (Print)
Publisher: Journal of Computer Engineering and Applications Beijing Co., Ltd., Science Press
Country of publisher: China
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: http://fcst.ceaj.org
