Scientific Reports (Sep 2021)
Taxonomic bias in AMP prediction of invertebrate peptides
Abstract
Invertebrate antimicrobial peptides (AMPs) are at the forefront of the search for agents of therapeutic utility against multi-resistant microbial pathogens, and in recent years substantial advances have been made in the in silico prediction of antimicrobial function from amino acid sequences. An aspect that has so far been neglected is taxonomic bias in the performance of these tools. Tools differ in their prediction algorithms and in the training data sets they use, and AMPs differ between taxa in sequence diversity, physicochemical properties and evolved biological functions; as a result, notable discrepancies may exist in performance between the currently available prediction tools. Here we tested whether the predictive power of 10 tools, comprising 20 prediction algorithms in total, is taxonomically biased across 19 invertebrate taxa, using a data set containing 1525 AMP and 3050 non-AMP sequences. We found that most of the tools exhibited considerable variation in performance between the tested invertebrate groups. Based on the per-taxon performances and on the variation in performance across taxa, we provide guidance on choosing the best-performing prediction tool for each assessed taxon by listing the highest-scoring tool for each of them.
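The per-taxon comparison described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not name the performance metric, so the Matthews correlation coefficient (MCC) is used here purely as an assumption, and the tool names, taxon names and predictions are hypothetical toy values.

```python
from collections import defaultdict
from math import sqrt


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient (assumed metric, not stated in the abstract)."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def per_taxon_scores(records):
    """Score each (tool, taxon) pair.

    records: iterable of (tool, taxon, is_amp, predicted_amp) tuples.
    """
    counts = defaultdict(lambda: [0, 0, 0, 0])  # tp, tn, fp, fn
    for tool, taxon, is_amp, predicted_amp in records:
        c = counts[(tool, taxon)]
        if is_amp and predicted_amp:
            c[0] += 1
        elif not is_amp and not predicted_amp:
            c[1] += 1
        elif not is_amp and predicted_amp:
            c[2] += 1
        else:
            c[3] += 1
    return {key: mcc(*c) for key, c in counts.items()}


def best_tool_per_taxon(scores):
    """Return {taxon: (tool, score)} keeping the highest-scoring tool per taxon."""
    best = {}
    for (tool, taxon), score in scores.items():
        if taxon not in best or score > best[taxon][1]:
            best[taxon] = (tool, score)
    return best


# Hypothetical toy data: two AMP and two non-AMP sequences per taxon.
truth = [True, True, False, False]
predictions = {
    ("tool_A", "Insecta"): [True, True, False, False],
    ("tool_B", "Insecta"): [True, False, True, False],
    ("tool_A", "Mollusca"): [True, False, False, True],
    ("tool_B", "Mollusca"): [True, True, False, False],
}
records = [
    (tool, taxon, t, p)
    for (tool, taxon), preds in predictions.items()
    for t, p in zip(truth, preds)
]

scores = per_taxon_scores(records)
best = best_tool_per_taxon(scores)
for taxon, (tool, score) in sorted(best.items()):
    print(f"{taxon}: {tool} (MCC = {score:.2f})")
```

In this toy example neither hypothetical tool is best everywhere, which is the kind of taxon-dependent performance the study set out to detect. Any other metric could be substituted for `mcc` without changing the selection step.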