Frontiers in Cell and Developmental Biology (Oct 2020)

Identifying Antioxidant Proteins by Using Amino Acid Composition and Protein-Protein Interactions

  • Yixiao Zhai,
  • Yu Chen,
  • Zhixia Teng,
  • Yuming Zhao

DOI
https://doi.org/10.3389/fcell.2020.591487
Journal volume & issue
Vol. 8

Abstract

Excessive oxidative stress responses can threaten our health, so it is essential to produce antioxidant proteins that regulate the body’s oxidative responses. The low number of known antioxidant proteins makes it difficult to extract their representative features. Our method did not use structural information; instead, it studied antioxidant proteins from a sequence perspective and focused on the impact of data imbalance on sensitivity, which greatly improved the model’s sensitivity for antioxidant protein recognition. We developed a method based on the Composition of k-spaced Amino Acid Pairs (CKSAAP) and Conjoint Triad (CT) features, derived from the amino acid composition and protein-protein interactions. SMOTE and the Max-Relevance-Max-Distance (MRMD) algorithm were used to balance the training data and to select the optimal feature subset, respectively. A random forest classifier was applied to the selected feature subset and evaluated with 10-fold cross-validation. The sensitivity was 0.792, the specificity was 0.808, and the average accuracy was 0.8.
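The two sequence encodings named above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the gap range `k_max`, the seven-class residue grouping, and the simple frequency normalization are all assumptions, since the abstract does not specify them.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Seven residue classes often used for conjoint triad encoding.
# Assumed here; the abstract does not list the grouping it used.
CT_CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
CLASS_OF = {aa: idx for idx, group in enumerate(CT_CLASSES) for aa in group}


def cksaap(seq, k_max=2):
    """Frequencies of residue pairs separated by k positions, k = 0..k_max.

    Returns 400 * (k_max + 1) values. k_max is a hypothetical default.
    """
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(pairs, 0)
        windows = 0
        for i in range(len(seq) - k - 1):
            pair = seq[i] + seq[i + k + 1]
            if pair in counts:  # skip pairs with non-standard residues
                counts[pair] += 1
                windows += 1
        features.extend(counts[p] / windows if windows else 0.0 for p in pairs)
    return features


def conjoint_triad(seq):
    """Frequencies of three consecutive residue classes (7^3 = 343 values)."""
    counts = [0] * 343
    for i in range(len(seq) - 2):
        triad = seq[i:i + 3]
        if all(aa in CLASS_OF for aa in triad):  # skip non-standard residues
            a, b, c = (CLASS_OF[aa] for aa in triad)
            counts[a * 49 + b * 7 + c] += 1
    total = sum(counts)
    return [c / total if total else 0.0 for c in counts]


def encode(seq, k_max=2):
    """Concatenate both encodings into one feature vector."""
    return cksaap(seq, k_max) + conjoint_triad(seq)
```

With `k_max=2`, `encode` yields 1200 + 343 = 1543 features per sequence. In a pipeline like the one described, such vectors would then be passed to the balancing, feature selection, and classification steps.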

Keywords