IEEE Access (Jan 2020)
A Novel Model for Predicting Essential Proteins Based on Heterogeneous Protein-Domain Network
Abstract
Essential proteins play significant roles in cell survival. In recent years, Protein-Protein Interaction (PPI) data have been discovered in Saccharomyces cerevisiae. Due to the high cost of biological experiments, a growing number of computational models have been adopted to predict essential proteins. However, the identification accuracy of these computational models still leaves considerable room for improvement. In this paper, a novel prediction model called NPRI is proposed to infer potential essential proteins based on the PageRank algorithm. In NPRI, a new heterogeneous Protein-Domain network is first constructed by integrating three kinds of networks: the weighted PPI network, the Domain-Domain network and the initial Protein-Domain network. These three networks are established from gene expression data, the original PPI network and the known Protein-Domain network, respectively. Next, based on the newly constructed heterogeneous Protein-Domain network, we extract functional features and topological characteristics for each protein to construct a novel distribution rate network. Then, an improved iteration method based on the PageRank algorithm is applied to the distribution rate network to infer essential proteins. Finally, to evaluate the performance of NPRI, we compare it with other state-of-the-art prediction models. Simulation results show that NPRI achieves identification accuracies of 90%, 84.5% and 79% among the top 100, 200 and 300 predicted candidate essential proteins, respectively, outperforming the competing models and indicating that NPRI is a promising framework for identifying essential proteins.
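To make the ranking step concrete, the following is a minimal sketch of a standard PageRank-style iteration with restart over a weighted network, followed by a top-k evaluation. It is not the improved iteration method of NPRI, whose details are not given in this abstract. The names `rates` (edge weights standing in for distribution rates), `initial` (initial protein scores), `alpha` and `top_k_precision` are hypothetical, and the sketch assumes that the reported accuracy is the fraction of known essential proteins among the top k candidates.

```python
from typing import Dict, List, Set

Graph = Dict[str, Dict[str, float]]  # node -> {neighbor: non-negative weight}


def rank_scores(rates: Graph, initial: Dict[str, float],
                alpha: float = 0.85, tol: float = 1e-10,
                max_iter: int = 1000) -> Dict[str, float]:
    """Standard PageRank with restart (an assumption, not NPRI's exact rule).

    Every node appearing in `rates` must also be a key of `initial`,
    and the initial scores must have a positive sum.
    """
    nodes = list(initial)
    total = sum(initial.values())
    # Restart vector: initial scores normalized to sum to 1.
    restart = {n: initial[n] / total for n in nodes}
    scores = dict(restart)
    for _ in range(max_iter):
        new = {n: (1.0 - alpha) * restart[n] for n in nodes}
        for u in nodes:
            out = rates.get(u, {})
            out_sum = sum(out.values())
            if out_sum > 0:
                # Distribute u's score to neighbors in proportion to edge weight.
                for v, w in out.items():
                    new[v] += alpha * scores[u] * w / out_sum
            else:
                # Node without outgoing weight: spread its score by the restart vector.
                for v in nodes:
                    new[v] += alpha * scores[u] * restart[v]
        delta = sum(abs(new[n] - scores[n]) for n in nodes)
        scores = new
        if delta < tol:
            break
    return scores


def top_k_precision(ranking: List[str], essential: Set[str], k: int) -> float:
    """Fraction of the top-k ranked proteins that are known to be essential."""
    return sum(1 for p in ranking[:k] if p in essential) / k


# Toy example with hypothetical protein names and weights.
toy_rates: Graph = {"A": {"B": 1.0, "C": 1.0}, "B": {"A": 1.0},
                    "C": {"A": 1.0}, "D": {"A": 1.0}}
toy_initial = {"A": 1.0, "B": 1.0, "C": 1.0, "D": 1.0}
toy_scores = rank_scores(toy_rates, toy_initial)
# Sort by descending score, breaking ties by name for a deterministic order.
toy_ranking = sorted(toy_scores, key=lambda n: (-toy_scores[n], n))
```

Under this reading, an accuracy of 90% in the top 100 would mean that 90 of the 100 highest-ranked candidates are known essential proteins.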
Keywords