Journal of Bioinformatics and Genomics (Dec 2024)

MACHINE LEARNING USING MULTIPLE LOGISTIC REGRESSION FOR ANTIMICROBIAL AND HEMOLYTIC PEPTIDES PREDICTION AND THEIR IDENTIFICATION IN LARGE PROTEINS

  • Krenev I.A.

DOI
https://doi.org/10.60797/jbg.2024.26.5
Journal volume & issue
Vol. 26, no. 4

Abstract

Antimicrobial peptides (AMPs) are considered a promising pool of alternative antimicrobial agents in the post-antibiotic era. Because a number of limitations, especially cytotoxicity, restrict their clinical use, the search for novel non-toxic AMPs is highly relevant. In the present study, we used multiple logistic regression to predict both the antimicrobial and the hemolytic capacity of peptides. The two resulting models demonstrated acceptable predictive power (at the estimated optimal cut-offs: accuracy, sensitivity, specificity, F-measure ≥ 0.82; ROC AUC > 0.91). The model for antimicrobial activity prediction was further applied to identify possible AMPs within large protein sequences. The method was validated on the precursors of well-known AMPs from different structural classes: human neutrophil peptide 1 (HNP1), the cathelicidin LL-37, and tachyplesin I. In all cases, the location of the mature AMP was predicted correctly, i.e. at the C-terminus (HNP1, LL-37) or in the middle of the precursor sequence (tachyplesin I). The study provides an easily interpretable method for predicting antimicrobial and hemolytic peptides and for identifying them in large proteins.
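To illustrate how a logistic regression score might be used to locate candidate AMPs inside a longer protein, the following minimal Python sketch scans a sequence with a fixed-length window and reports fragments whose predicted probability exceeds a cut-off. The abstract does not specify the descriptors, coefficients, window length, or scanning scheme, so everything here is an assumption made for illustration: the two features (net charge and hydrophobic fraction), the coefficient values, the window of 20 residues, the cut-off of 0.5, and the function names are all hypothetical and are not the fitted model from the study.

```python
import math

# Hypothetical coefficients for illustration only; NOT the values fitted in the study.
INTERCEPT = -3.0
COEF_CHARGE = 0.6        # assumed weight per unit of net charge
COEF_HYDROPHOBIC = 4.0   # assumed weight for the hydrophobic residue fraction

# Assumed set of hydrophobic residues (one-letter amino acid codes).
HYDROPHOBIC = set("AILMFWVC")


def features(peptide):
    """Return (net charge, hydrophobic fraction) for a peptide string."""
    positive = sum(peptide.count(a) for a in "KR")
    negative = sum(peptide.count(a) for a in "DE")
    hydrophobic = sum(1 for a in peptide if a in HYDROPHOBIC)
    return positive - negative, hydrophobic / len(peptide)


def amp_probability(peptide):
    """Logistic model: p = 1 / (1 + exp(-(b0 + b1*x1 + b2*x2)))."""
    charge, hydrophobic_fraction = features(peptide)
    z = INTERCEPT + COEF_CHARGE * charge + COEF_HYDROPHOBIC * hydrophobic_fraction
    return 1.0 / (1.0 + math.exp(-z))


def scan_protein(sequence, window=20, cutoff=0.5):
    """Slide a window along the sequence; keep fragments scoring at or above the cut-off.

    Returns a list of (start index, fragment, probability) tuples.
    """
    hits = []
    for start in range(len(sequence) - window + 1):
        fragment = sequence[start:start + window]
        p = amp_probability(fragment)
        if p >= cutoff:
            hits.append((start, fragment, p))
    return hits


# Toy precursor: an anionic pro-region followed by a cationic, amphipathic C-terminal part.
toy_precursor = "DE" * 15 + "KLAKLAKKLAKLAKKLAKLA"
hits = scan_protein(toy_precursor)
best = max(hits, key=lambda h: h[2])
print(best[0], best[1], round(best[2], 3))
```

In this toy case the highest-scoring window coincides with the cationic C-terminal segment, which mirrors the kind of localization described for HNP1 and LL-37. A real application would replace the hypothetical coefficients with a fitted model and choose the cut-off from a ROC analysis.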

Keywords