Jisuanji kexue (Aug 2021)

Identifying Essential Proteins by Hybrid Deep Learning Model

  • LIU Wen-yang, GUO Yan-bu, LI Wei-hua

DOI
https://doi.org/10.11896/jsjkx.200700076
Journal volume & issue
Vol. 48, no. 8
pp. 240 – 245

Abstract

Essential proteins are proteins that are indispensable to the viability of an organism. Identifying them helps to understand the minimum requirements of cell life and to discover disease-causing genes and drug targets, and is of great significance for the diagnosis and treatment of diseases and for drug design. Existing work shows that integrating protein interaction networks with relevant sequence features can improve the accuracy and robustness of essential protein identification. In this paper, gene expression profiles, protein interaction networks and subcellular location information are integrated, and a hybrid neural network model, IEPHDL, is designed. For the first time, the IEPHDL model uses a bidirectional gated recurrent unit to perform feature learning on gene expression profiles. It then uses a deep neural network composed of multiple fully connected layers to relearn the three kinds of data features jointly. In this way it exploits the respective strengths of the bidirectional gated recurrent unit network, the fully connected network and Node2vec in feature learning and representation to identify essential proteins effectively. Experimental results show that IEPHDL achieves an accuracy of 88.7%, a precision of 86.2% and an AUC of 85.2% for essential protein identification. Its accuracy is 13%, 8.9% and 3.8% higher than that of the best current centrality method, machine learning method and deep learning method, respectively, and its other indicators are also higher than those of the three methods. Finally, experimental analysis confirms that the bidirectional gated recurrent unit network, owing to its strong feature learning ability, plays a key role in essential protein identification.
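
The fusion described in the abstract can be illustrated with a minimal forward-pass sketch in NumPy. It is not the authors' implementation: the layer sizes, the random weights, the choice of the final hidden state as the sequence summary, and all function names (init_gru, gru_last_state, bigru_features, predict_essential) are assumptions made for illustration. The network embedding is assumed to be precomputed (for example by Node2vec) and passed in as a plain vector, and the subcellular location is assumed to be a one-hot vector.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: time points, GRU hidden units, embedding width, location classes.
T, HID, EMB, LOC = 12, 8, 16, 11


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_gru(input_dim, hidden_dim):
    """Random (untrained) weights for one GRU direction: update z, reset r, candidate h."""
    def w(*shape):
        return rng.normal(0.0, 0.1, size=shape)
    return {gate: (w(input_dim, hidden_dim), w(hidden_dim, hidden_dim), np.zeros(hidden_dim))
            for gate in ("z", "r", "h")}


def gru_last_state(seq, p):
    """Run a GRU over seq of shape (time_steps, input_dim); return the final hidden state."""
    h = np.zeros(p["z"][2].shape[0])
    for x in seq:
        z = sigmoid(x @ p["z"][0] + h @ p["z"][1] + p["z"][2])             # update gate
        r = sigmoid(x @ p["r"][0] + h @ p["r"][1] + p["r"][2])             # reset gate
        cand = np.tanh(x @ p["h"][0] + (r * h) @ p["h"][1] + p["h"][2])    # candidate state
        h = (1.0 - z) * h + z * cand                                       # one common gating convention
    return h


def bigru_features(seq, fwd, bwd):
    """Concatenate the final states of a forward pass and a pass over the reversed sequence."""
    return np.concatenate([gru_last_state(seq, fwd), gru_last_state(seq[::-1], bwd)])


def predict_essential(expr_seq, node_emb, loc_vec, params):
    """Fuse the three feature groups and score them with fully connected layers."""
    expr_feat = bigru_features(expr_seq, params["fwd"], params["bwd"])
    fused = np.concatenate([expr_feat, node_emb, loc_vec])
    hidden = np.maximum(fused @ params["w1"] + params["b1"], 0.0)          # ReLU layer
    return float(sigmoid(hidden @ params["w2"] + params["b2"])[0])         # score in (0, 1)


params = {
    "fwd": init_gru(1, HID),
    "bwd": init_gru(1, HID),
    "w1": rng.normal(0.0, 0.1, (2 * HID + EMB + LOC, 32)),
    "b1": np.zeros(32),
    "w2": rng.normal(0.0, 0.1, (32, 1)),
    "b2": np.zeros(1),
}

# Synthetic inputs for a single protein (placeholders, not real data).
expr_seq = rng.normal(size=(T, 1))   # expression level at each time point
node_emb = rng.normal(size=EMB)      # stand-in for a precomputed network embedding
loc_vec = np.zeros(LOC)
loc_vec[3] = 1.0                     # one-hot subcellular location

score = predict_essential(expr_seq, node_emb, loc_vec, params)
print(f"essentiality score (untrained weights): {score:.3f}")
```

Because the weights are random, the printed score is meaningless; the sketch only shows how sequence features from the two GRU directions, a network embedding and a location vector could be concatenated and passed through fully connected layers.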

Keywords