PLoS ONE (Jan 2022)

Novel extreme regression-voting classifier to predict death risk in vaccinated people using VAERS data.

  • Eysha Saad,
  • Saima Sadiq,
  • Ramish Jamil,
  • Furqan Rustam,
  • Arif Mehmood,
  • Gyu Sang Choi,
  • Imran Ashraf

DOI
https://doi.org/10.1371/journal.pone.0270327
Journal volume & issue
Vol. 17, no. 6
p. e0270327

Abstract

COVID-19 vaccination has raised serious concerns among the public, and many people are preoccupied by rumors regarding resulting illness, adverse reactions, and death. Such rumors are dangerous to the campaign against COVID-19 and should be addressed promptly. One prospective solution is to use machine learning-based models to predict the death risk for vaccinated people and clarify people's perceptions regarding that risk. This study focuses on predicting the death risk for vaccinated people following a second dose, for two reasons: first, to build consensus among people to get vaccinated; second, to reduce fear regarding vaccines. To that end, this study uses the COVID-19 VAERS dataset, which records adverse events after COVID-19 vaccination as 'recovered', 'not recovered', and 'survived'. To obtain better prediction results, a novel voting classifier, the extreme regression-voting classifier (ER-VC), is introduced. ER-VC ensembles an extra tree classifier and logistic regression using the soft voting criterion. To avoid model overfitting and obtain better results, two data balancing techniques, synthetic minority oversampling (SMOTE) and adaptive synthetic sampling (ADASYN), are applied. Moreover, three feature extraction techniques, term frequency-inverse document frequency (TF-IDF), bag of words (BoW), and global vectors (GloVe), are compared. Both machine learning and deep learning models are used in the experiments. Extensive experiments show that the proposed model combined with TF-IDF achieves robust results, with 0.85 accuracy when trained on the SMOTE-balanced dataset. In line with this, validation of the proposed voting classifier on binary classification shows state-of-the-art results with 0.98 accuracy. The results show that machine learning models can predict the death risk with high accuracy and can assist the authorities in taking timely measures.
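
The soft voting criterion mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation: it only shows the general idea of averaging the per-class probabilities produced by two models and choosing the class with the highest mean. The function names, the label order, and the probability values are all hypothetical, chosen for illustration.

```python
from typing import List, Optional, Sequence

# Class names taken from the abstract; their order here is an assumption.
LABELS = ["recovered", "not recovered", "survived"]


def soft_vote(
    probas: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> List[float]:
    """Return the (weighted) mean of per-class probabilities across models."""
    if weights is None:
        weights = [1.0] * len(probas)  # equal weight for every model
    total = sum(weights)
    n_classes = len(probas[0])
    return [
        sum(w * p[c] for w, p in zip(weights, probas)) / total
        for c in range(n_classes)
    ]


def predict(
    probas: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> str:
    """Pick the label whose averaged probability is largest."""
    avg = soft_vote(probas, weights)
    best = max(range(len(avg)), key=avg.__getitem__)
    return LABELS[best]


# Hypothetical outputs for one report: the two models disagree on the top class.
et_proba = [0.50, 0.30, 0.20]  # stand-in for an extra tree classifier
lr_proba = [0.20, 0.60, 0.20]  # stand-in for logistic regression

print(predict([et_proba, lr_proba]))
```

In this made-up case the two models disagree on the most likely class, so a majority (hard) vote would tie. Averaging the probabilities breaks the tie in favor of the class the second model is more confident about, which is the usual motivation for soft voting.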