PeerJ Computer Science (Oct 2024)

Prediction of antigenic peptides of SARS-CoV-2 pathogen using machine learning

  • Syed Nisar Hussain Bukhari,
  • Kingsley A. Ogudo

DOI
https://doi.org/10.7717/peerj-cs.2319
Journal volume & issue
Vol. 10
p. e2319

Abstract


Antigenic peptides (APs), also known as T-cell epitopes (TCEs), are the immunogenic segments of pathogens capable of inducing an immune response, making them potential candidates for epitope-based vaccine (EBV) design. Traditional wet-lab methods for identifying TCEs are expensive, challenging, and time-consuming. Computational approaches employing machine learning (ML) techniques offer a faster and more cost-effective alternative. In this study, we present a robust XGBoost ML model for predicting TCEs of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) as potential vaccine candidates. Peptide sequences comprising TCEs and non-TCEs were retrieved from the Immune Epitope Database (IEDB) repository and subjected to feature extraction to obtain their physicochemical properties for model training. When evaluated on a test dataset, the model achieved an accuracy of 97.6%, outperforming other ML classifiers. Five-fold cross-validation yielded a mean accuracy of 97.58%, indicating consistent performance across all folds. While the predicted epitopes show promise as vaccine candidates for SARS-CoV-2, further in vitro and in vivo studies are essential to validate their suitability.
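
The abstract does not list the exact physicochemical descriptors or the cross-validation tooling, so the following is only a minimal sketch of what the two steps could look like. It assumes a small illustrative feature set (length, mean Kyte-Doolittle hydropathy, a crude net charge that counts only K/R and D/E, and aromatic fraction) and a simple contiguous five-fold split. The function names `peptide_features` and `k_fold_indices` and the example string are hypothetical, not taken from the paper. A classifier such as XGBoost would be trained on the resulting feature rows; that part is omitted here.

```python
from statistics import mean

# Kyte-Doolittle hydropathy scale for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def peptide_features(seq):
    """Return an illustrative feature dict for one peptide sequence.

    Assumption: these four descriptors stand in for the paper's
    (unspecified) physicochemical feature set.
    """
    seq = seq.upper()
    unknown = set(seq) - set(KD)
    if not seq or unknown:
        raise ValueError(f"empty or non-standard residues: {sorted(unknown)}")
    n = len(seq)
    return {
        "length": n,
        # Mean hydropathy over all residues (often called GRAVY).
        "gravy": mean(KD[a] for a in seq),
        # Crude net charge: +1 per K/R, -1 per D/E; ignores H and termini.
        "net_charge": sum(seq.count(a) for a in "KR")
        - sum(seq.count(a) for a in "DE"),
        "aromatic_fraction": sum(seq.count(a) for a in "FWY") / n,
    }


def k_fold_indices(n, k=5):
    """Split indices 0..n-1 into k contiguous, near-equal test folds."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


if __name__ == "__main__":
    # Arbitrary example string, not an epitope from the study.
    print(peptide_features("ACDKR"))
    print(k_fold_indices(12, k=5))
```

In each round, one fold serves as the test set and the remaining four as training data; the reported mean accuracy is the arithmetic mean of the five per-fold accuracies. Real data should be shuffled (and ideally stratified by class) before splitting, which this sketch leaves out.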

Keywords