pAtbP-EnC: Identifying Anti-Tubercular Peptides Using Multi-Feature Representation and Genetic Algorithm-Based Deep Ensemble Model

Shahid Akbar; Ali Raza; Tamara Al Shloul; Ashfaq Ahmad; Aamir Saeed; Yazeed Yasin Ghadi; Orken Mamyrbayev; Elsayed Tag-Eldin

doi:10.1109/ACCESS.2023.3321100

IEEE Access (Jan 2023)

Affiliations

Shahid Akbar: Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, Pakistan
Ali Raza: Department of Computer Science, MY University Islamabad, Islamabad, Pakistan
Tamara Al Shloul: Department of General Education, Liwa College of Technology, Abu Dhabi, United Arab Emirates
Ashfaq Ahmad: Department of Computer Science, MY University Islamabad, Islamabad, Pakistan
Aamir Saeed: Department of Computer Science and IT, University of Engineering and Technology, Peshawar, Pakistan
Yazeed Yasin Ghadi: Department of Computer Science, Al Ain University, Abu Dhabi, United Arab Emirates
Orken Mamyrbayev: Institute of Information and Computational Technologies, Almaty, Kazakhstan
Elsayed Tag-Eldin: Faculty of Engineering and Technology, Future University in Egypt, New Cairo, Egypt

Journal volume & issue: Vol. 11
pp. 137099 – 137114

Abstract

Mycobacterium tuberculosis, a highly dangerous human pathogen, is the causative agent of tuberculosis (TB), which affects nearly 33% of the global population. With the increasing prevalence of multidrug-resistant TB, novel and efficacious alternative therapies are needed. Peptide therapies have emerged as a favorable alternative because of their remarkable specificity in targeting cells without affecting healthy cells. However, experimental identification of anti-tubercular peptides (AtbPs) is labor-intensive and costly, and accurate prediction of AtbPs remains challenging given the large number of peptide samples. In this paper, we propose an ensemble learning model that improves prediction outcomes by addressing the limitations of individual learning models. We numerically encode the peptide samples using four distinct representation methods: AAindex, Composition/Transition/Distribution, Dipeptide Deviation from Expected Mean, and Enhanced Grouped Amino Acid Composition. The feature vectors extracted by these methods are fused into a compact vector. We evaluate prediction performance with three different classification models on both the individual and the fused (heterogeneous) feature vectors. Furthermore, we enhance the prediction and training capabilities of the proposed model by using the predicted labels of the individual classifiers to build a deep ensemble model via a genetic algorithm. The proposed ensemble learner achieves accuracies of 97.80%, 95.13%, 93.91%, and 94.17% on the RD training, MD training, RD independent, and MD independent datasets, respectively. Our findings show that the proposed pAtbP-EnC model outperforms existing predictors, reporting approximately 11% higher training accuracy. We conclude that the pAtbP-EnC predictor will be a valuable tool for pharmaceutical design and academic research. The datasets and the source code are publicly available at https://github.com/Intelligent-models/pAtbP-EnC2023.
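
The idea of combining the predicted labels of individual classifiers through a genetic algorithm can be illustrated with a small sketch. The abstract does not give the details of the authors' ensemble, so everything below is an assumption made for illustration: the function names (weighted_vote, accuracy, evolve_weights), the weighted-voting scheme, the population settings, and the toy data are all hypothetical and are not taken from the pAtbP-EnC source code. The genetic algorithm here simply searches for per-classifier voting weights that maximize training accuracy.

```python
import random

def weighted_vote(weights, votes):
    """Combine 0/1 votes from several classifiers using non-negative weights."""
    total = sum(weights)
    if total == 0:
        return 0  # degenerate chromosome: default to the negative class
    score = sum(w * v for w, v in zip(weights, votes)) / total
    return 1 if score >= 0.5 else 0

def accuracy(weights, all_votes, labels):
    """Fraction of samples whose weighted vote matches the true label."""
    hits = sum(weighted_vote(weights, v) == y for v, y in zip(all_votes, labels))
    return hits / len(labels)

def evolve_weights(all_votes, labels, pop_size=20, generations=30,
                   mutation_rate=0.2, seed=0):
    """Hypothetical GA: evolve one voting weight per base classifier."""
    rng = random.Random(seed)
    n = len(all_votes[0])

    def fitness(w):
        return accuracy(w, all_votes, labels)

    # Seed the population with equal weights and each single classifier,
    # so the result is never worse than those baselines on this data.
    population = [[1.0] * n]
    population += [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    while len(population) < pop_size:
        population.append([rng.random() for _ in range(n)])

    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # elitism: keep the top half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n) if n > 1 else 0  # one-point crossover
            child = a[:cut] + b[cut:]
            # Gaussian mutation, clipped to the range [0, 1]
            child = [min(1.0, max(0.0, g + rng.gauss(0, 0.1)))
                     if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)

# Toy data (invented): each row holds the labels predicted by three
# hypothetical base classifiers for one peptide; 1 = AtbP, 0 = non-AtbP.
toy_labels = [1, 0, 1, 1, 0, 0, 1, 0]
toy_votes = [
    [1, 0, 0], [0, 1, 1], [1, 1, 0], [1, 0, 1],
    [0, 0, 1], [0, 1, 0], [1, 1, 1], [0, 0, 0],
]

best = evolve_weights(toy_votes, toy_labels)
majority_acc = accuracy([1.0, 1.0, 1.0], toy_votes, toy_labels)
best_acc = accuracy(best, toy_votes, toy_labels)
print("majority vote accuracy:", majority_acc)
print("GA-weighted accuracy:", best_acc)
```

Because the top half of the population is carried over unchanged in every generation, the best training accuracy never decreases. In practice the weights would be selected on held-out data rather than on the training labels, and the base predictions would come from classifiers trained on the fused feature vector.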

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
