Prediction of anti-inflammatory peptides by a sequence-based stacking ensemble model named AIPStack
Hua Deng,
Chaofeng Lou,
Zengrui Wu,
Weihua Li,
Guixia Liu,
Yun Tang
Affiliations
Hua Deng, Chaofeng Lou, Zengrui Wu, Weihua Li, Guixia Liu, Yun Tang (corresponding author)
Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
Summary: Accurate and efficient identification of anti-inflammatory peptides (AIPs) is crucial for the treatment of inflammation. Here, we propose a two-layer stacking ensemble model, AIPStack, to predict AIPs from sequence. First, we constructed a new dataset for model building and validation. Peptide sequences were then represented by hybrid features obtained by fusing two amino acid composition descriptors. Next, the stacking ensemble was built with random forest and extremely randomized trees as the base classifiers and logistic regression as the meta-classifier, which takes the outputs of the base classifiers as its input. On independent set 3, AIPStack achieved an AUC of 0.819, an accuracy of 0.755, and an MCC of 0.510, all higher than the values obtained by other AIP predictors. Furthermore, the essential sequence features were highlighted with the SHapley Additive exPlanations (SHAP) method. We anticipate that AIPStack can be used for high-throughput AIP prediction and can facilitate hypothesis-driven experimental design.
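As an illustration of the hybrid feature idea, the following minimal sketch fuses two composition descriptors into one vector per peptide. The summary does not name the two descriptors, so amino acid composition (AAC) and dipeptide composition (DPC) are used here purely as assumed stand-ins. The function names `aac`, `dpc`, and `hybrid_features` are hypothetical and are not taken from AIPStack.

```python
from itertools import product
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs


def aac(seq: str) -> List[float]:
    """Amino acid composition: fraction of each standard residue (20 values)."""
    seq = seq.upper()
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def dpc(seq: str) -> List[float]:
    """Dipeptide composition: fraction of each adjacent residue pair (400 values)."""
    seq = seq.upper()
    counts = {d: 0 for d in DIPEPTIDES}
    n_pairs = len(seq) - 1
    for i in range(n_pairs):
        pair = seq[i:i + 2]
        if pair in counts:  # pairs containing non-standard residues are not counted
            counts[pair] += 1
    return [counts[d] / n_pairs for d in DIPEPTIDES]


def hybrid_features(seq: str) -> List[float]:
    """Fuse the two descriptors by simple concatenation (assumed fusion scheme)."""
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two residues")
    return aac(seq) + dpc(seq)


features = hybrid_features("ACAC")  # toy peptide, not from the AIPStack dataset
print(len(features))  # 20 AAC values followed by 400 DPC values
```

Vectors of this kind could then be passed to a two-layer stack, for example scikit-learn's `StackingClassifier` with `RandomForestClassifier` and `ExtraTreesClassifier` as `estimators` and `LogisticRegression` as `final_estimator`. Whether AIPStack was implemented this way is not stated in the summary.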