Identification of essential proteins based on edge features and the fusion of multiple-source biological information

Peiqiang Liu; Chang Liu; Yanyan Mao; Junhong Guo; Fanshu Liu; Wangmin Cai; Feng Zhao

doi:10.1186/s12859-023-05315-y

BMC Bioinformatics (May 2023)

Affiliations

Peiqiang Liu: School of Computer Science and Technology, Shandong Technology and Business University
Chang Liu: School of Computer Science and Technology, Shandong Technology and Business University
Yanyan Mao: School of Computer Science and Technology, Shandong Technology and Business University
Junhong Guo: School of Computer Science and Technology, Shandong Technology and Business University
Fanshu Liu: School of Computer Science and Technology, Shandong Technology and Business University
Wangmin Cai: School of Computer Science and Technology, Shandong Technology and Business University
Feng Zhao: School of Computer Science and Technology, Shandong Technology and Business University

Journal volume & issue: Vol. 24, no. 1
pp. 1 – 24

Abstract

Background: A major current focus in the analysis of protein–protein interaction (PPI) data is how to identify essential proteins. The availability of massive PPI data warrants the design of efficient computational methods for identifying essential proteins. Previous studies have achieved considerable performance. However, because PPI data are noisy and structurally complex, further improving the performance of identification methods remains a challenge.

Methods: This paper proposes an identification method, named CTF, which identifies essential proteins based on edge features, including h-quasi-cliques and uv-triangle graphs, and on the fusion of multiple-source information. We first design an edge-weight function, named EWCT, for computing the topological scores of proteins based on quasi-cliques and triangle graphs. Then, we generate an edge-weighted PPI network using EWCT and dynamic PPI data. Finally, we compute the essentiality of proteins by fusing the topological scores with three scores derived from biological information.

Results: We evaluated the performance of the CTF method by comparing it with 16 other methods, such as MON, PeC, TEGS, and LBCC. The experimental results on three datasets of Saccharomyces cerevisiae show that CTF outperforms the state-of-the-art methods. Moreover, our results indicate that fusing other biological information helps improve the accuracy of identification.
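The general idea of scoring proteins from edge features and then fusing the result with biological evidence can be illustrated with a small sketch. This is not the EWCT function or the CTF method, whose definitions are not given in the abstract. It is a simplified stand-in that assumes an edge is weighted by the number of triangles it belongs to, that a protein's topological score is the sum of its incident edge weights, and that fusion is a weighted average of min-max normalised scores. The function names, the `alpha` parameter, and the toy data are all hypothetical.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map, ignoring self-loops."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def triangle_edge_weights(edges):
    """Weight each edge by the number of triangles containing it.

    Assumption: this common-neighbour count stands in for a
    triangle-based edge feature; it is not the paper's EWCT function.
    """
    adj = build_adjacency(edges)
    weights = {}
    for u, v in edges:
        if u == v:
            continue
        key = tuple(sorted((u, v)))
        weights[key] = len(adj[u] & adj[v])
    return weights


def topological_scores(edges):
    """Score each protein by the sum of its incident edge weights."""
    scores = defaultdict(float)
    for (u, v), w in triangle_edge_weights(edges).items():
        scores[u] += w
        scores[v] += w
    return dict(scores)


def min_max_normalise(scores):
    """Rescale scores to [0, 1]; all-equal inputs map to 0.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {k: (v - lo) / span if span else 0.0 for k, v in scores.items()}


def fuse_scores(topo, bio, alpha=0.5):
    """Weighted average of normalised topological and biological scores.

    Assumption: `alpha` and the linear form are illustrative choices.
    Proteins missing from one source contribute 0.0 for that source.
    """
    topo_n = min_max_normalise(topo)
    bio_n = min_max_normalise(bio)
    proteins = set(topo_n) | set(bio_n)
    return {
        p: alpha * topo_n.get(p, 0.0) + (1 - alpha) * bio_n.get(p, 0.0)
        for p in proteins
    }


# Toy network: A, B, C form a triangle; D hangs off C.
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
topo = topological_scores(edges)
# Hypothetical biological scores for two of the proteins.
bio = {"A": 0.2, "D": 0.8}
fused = fuse_scores(topo, bio, alpha=0.75)
ranking = sorted(fused, key=lambda p: (-fused[p], p))
```

In the toy network the edge C–D lies in no triangle, so D receives a topological score of zero and is ranked only through its biological score. A real pipeline would replace both the edge weight and the fusion rule with the definitions given in the full paper.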

Published in BMC Bioinformatics

ISSN: 1471-2105 (Online)
Publisher: BMC
Country of publisher: United Kingdom
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics; Science: Biology (General)
Website: http://www.biomedcentral.com/bmcbioinformatics/
