BMC Bioinformatics (Nov 2023)

A seed expansion-based method to identify essential proteins by integrating protein–protein interaction sub-networks and multiple biological characteristics

  • He Zhao,
  • Guixia Liu,
  • Xintian Cao

DOI
https://doi.org/10.1186/s12859-023-05583-8
Journal volume & issue
Vol. 24, no. 1
pp. 1 – 29

Abstract

Background: The identification of essential proteins is of great significance in biology and pathology. However, protein–protein interaction (PPI) data obtained through high-throughput technology include a high number of false positives. To overcome this limitation, numerous computational algorithms based on biological characteristics and topological features have been proposed to identify essential proteins.

Results: In this paper, we propose a novel method named SESN for identifying essential proteins. It is a seed expansion method based on PPI sub-networks and multiple biological characteristics. Firstly, SESN utilizes gene expression data to construct PPI sub-networks. Secondly, seed expansion is performed simultaneously in each sub-network, and the expansion process is based on the topological features of predicted essential proteins. Thirdly, the error correction mechanism is based on multiple biological characteristics and the entire PPI network. Finally, SESN analyzes the impact of each biological characteristic, including protein complex, gene expression data, GO annotations, and subcellular localization, and adopts the biological data with the best experimental results. The output of SESN is a set of predicted essential proteins.

Conclusions: The analysis of each component of SESN indicates the effectiveness of all components. We conduct comparison experiments using three datasets from two species, and the experimental results demonstrate that SESN achieves superior performance compared to other methods.
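
The first two steps described in the abstract (expression-based sub-networks, then seed expansion inside each sub-network) can be illustrated with a small sketch. This is not the SESN algorithm: the abstract does not specify how sub-networks are built or which topological features drive expansion, so the activity threshold, the "most links to the predicted set" scoring rule, the function names `build_subnetworks` and `expand_seeds`, and the toy data below are all assumptions made purely for illustration.

```python
from collections import defaultdict

def build_subnetworks(edges, expression, threshold):
    """Return one adjacency map per time point.

    Assumed rule (not from the paper): an edge is kept at time t only if
    both of its proteins have expression above `threshold` at t.
    """
    n_points = len(next(iter(expression.values())))
    subnets = []
    for t in range(n_points):
        adj = defaultdict(set)
        for u, v in edges:
            if expression[u][t] > threshold and expression[v][t] > threshold:
                adj[u].add(v)
                adj[v].add(u)
        subnets.append(adj)
    return subnets

def expand_seeds(adj, seeds, target_size):
    """Greedy seed expansion inside one sub-network.

    Assumed scoring (not from the paper): at each step, add the neighbouring
    protein with the most links to the already predicted set; ties are
    broken alphabetically.
    """
    # Seeds that are not present in this sub-network are ignored.
    predicted = {s for s in seeds if s in adj}
    while len(predicted) < target_size:
        frontier = {n for p in predicted for n in adj[p]} - predicted
        if not frontier:
            break
        best = max(sorted(frontier), key=lambda n: len(adj[n] & predicted))
        predicted.add(best)
    return predicted

# Hypothetical toy data: five proteins, two time points.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]
expression = {
    "A": [5, 1], "B": [5, 5], "C": [5, 5], "D": [1, 5], "E": [1, 5],
}
subnets = build_subnetworks(edges, expression, threshold=2)
candidates = [expand_seeds(adj, {"C"}, target_size=3) for adj in subnets]
```

Each entry of `candidates` is the set predicted within one sub-network. How SESN combines the per-sub-network results and applies its error correction with protein complex, GO annotation, and subcellular localization data is not described in the abstract and is deliberately left out of this sketch.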

Keywords