IEEE Access (Jan 2019)

Identification of Human Membrane Protein Types by Incorporating Network Embedding Methods

  • Xiaolin Zhang,
  • Lei Chen,
  • Zi-Han Guo,
  • Haiyan Liang

DOI
https://doi.org/10.1109/ACCESS.2019.2944177
Journal volume & issue
Vol. 7
pp. 140794 – 140805

Abstract

Membrane proteins are an important class of proteins and have been confirmed to play essential roles in various cellular processes. Based on their intramolecular arrangements and positions in a cell, they can be categorized into several types. However, recognizing the type of a given membrane protein via traditional biophysical methods is time-consuming and costly. In view of this, several computational models have been proposed in recent years. Most of these models used various kinds of information about membrane proteins, such as their sequences, domain profiles, and physicochemical properties, to extract features that were fed into downstream classification algorithms. In this study, we built two novel prediction models that incorporated a new kind of feature extraction method, namely network embedding. To this end, several protein networks were constructed using the protein-protein interaction information retrieved from STRING. One model was constructed from features obtained by applying Mashup to seven protein networks; the other was built using features produced by Node2Vec on one comprehensive protein network. Each model adopted random forest as the classification algorithm and employed the Synthetic Minority Over-sampling Technique (SMOTE) to mitigate the effect of the large differences in size among membrane protein types. Furthermore, the two models were integrated into one model to improve prediction quality. The test results showed that the integrated model performed well and was superior to either individual model. We also compared our models with some previous models, and the comparison suggested that our models were competitive.
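
To illustrate the oversampling step mentioned above, the following is a minimal pure-Python sketch of the core idea behind SMOTE: a synthetic sample is placed at a random position on the segment between a minority-class sample and one of its nearest minority-class neighbours. It is not the authors' implementation. The function name `smote`, its parameters, and the toy two-dimensional "embedding" vectors are assumptions made up for illustration only.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the starting point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Find its k nearest minority neighbours (excluding itself).
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Interpolate at a random position between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy 2-D feature vectors for a rare class (made-up values).
rare_type = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(rare_type, n_new=6, k=2)
```

In a pipeline of the kind the abstract describes, the synthetic vectors would be added to the training set of the under-represented membrane protein types before fitting the classifier. In practice an established library implementation would likely be preferable to this sketch.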

Keywords