F1000Research (Aug 2024)
COVID-19 Vaccine: Predicting Vaccine Types and Assessing Mortality Risk Through Ensemble Learning Algorithms [version 2; peer review: 1 approved, 2 approved with reservations, 1 not approved]
Abstract
Background: Vaccination is crucial for preventing the spread of disease; however, no vaccine is perfect or works for everyone. The main objective of this work is to predict which vaccine will be most effective for a candidate without causing severe adverse reactions, and to identify patients potentially at high risk of death following COVID-19 vaccination.
Methods: A comprehensive analysis was conducted on a dataset of COVID-19 vaccine adverse reactions, covering both binary and multiclass classification scenarios. Tree-based and ensemble models, namely Decision Tree, Random Forest, Light Gradient Boosting (LightGBM), and Extreme Gradient Boosting (XGBoost), were used for prediction. The class-balancing techniques SMOTE, Tomek links, and SMOTETomek were applied to improve model performance.
Results: Pre-existing conditions such as diabetes, hypertension, and heart disease, together with history of allergies, prior vaccinations, other medications, age, and gender, were crucial factors associated with poor outcomes. Using medical history, the classifiers achieved accuracy scores ranging from 75% to 87% in predicting the vaccine type and mortality risk. The Random Forest model emerged as the best prediction model, and the SMOTE and SMOTETomek methods generally improved model performance.
Conclusion: The Random Forest model is the top recommendation for machine learning tasks that require high accuracy and resilience. The findings also highlight the critical role of medical history in optimizing vaccine outcomes and minimizing adverse reactions.
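To illustrate the class-balancing step named in the Methods, the following is a minimal pure-Python sketch of the core SMOTE idea: each synthetic minority sample is a random point on the line segment between a minority sample and one of its nearest minority neighbors. It is not the authors' implementation. The function names (`smote_oversample`, `balance`), the toy features (age and a binary comorbidity flag), and all values are assumptions made for illustration; in practice a library such as imbalanced-learn would normally be used.

```python
import math
import random


def euclidean(p, q):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    minority: list of equal-length numeric feature vectors (hypothetical toy data).
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: euclidean(base, p))
        neighbor = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbor.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


def balance(X, y, minority_label, seed=0):
    """Oversample minority_label until it matches the largest other class."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    majority_count = max(
        sum(1 for label in y if label == c) for c in set(y) if c != minority_label
    )
    n_new = max(0, majority_count - len(minority))
    new_points = smote_oversample(minority, n_new, seed=seed)
    return X + new_points, y + [minority_label] * len(new_points)


# Toy data (assumed): [age, comorbidity flag]; label 1 is the rare outcome.
X = [[30, 0], [35, 0], [40, 1], [45, 0], [50, 1], [55, 0],
     [70, 1], [75, 1], [80, 1]]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1]

X_bal, y_bal = balance(X, y, minority_label=1)
print(len(X_bal), y_bal.count(0), y_bal.count(1))  # 12 6 6
```

Resampling of this kind is normally applied to the training split only, so that the test set keeps the original class distribution. Plain interpolation can also produce fractional values for binary features, which is one reason library implementations offer variants for mixed numeric and categorical data.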