Information (Dec 2021)

Predicting COVID-19 Cases in South Korea with All K-Edited Nearest Neighbors Noise Filter and Machine Learning Techniques

  • David Opeoluwa Oyewola,
  • Emmanuel Gbenga Dada,
  • Sanjay Misra,
  • Robertas Damaševičius

DOI
https://doi.org/10.3390/info12120528
Journal volume & issue
Vol. 12, no. 12
p. 528

Abstract

Read online

The application of machine learning techniques to the epidemiology of COVID-19 is a necessary measure that can be exploited to curtail the further spread of this endemic. Conventional techniques used to determine the epidemiology of COVID-19 are slow and costly, and data are scarce. We investigate the effects of noise filters on the performance of machine learning algorithms on the COVID-19 epidemiology dataset. Noise filter algorithms are used to remove noise from the datasets utilized in this study. We applied nine machine learning techniques to classify the epidemiology of COVID-19, which are bagging, boosting, support vector machine, bidirectional long short-term memory, decision tree, naïve Bayes, k-nearest neighbor, random forest, and multinomial logistic regression. Data from patients who contracted coronavirus disease were collected from the Kaggle database between 23 January 2020 and 24 June 2020. Noisy and filtered data were used in our experiments. As a result of denoising, machine learning models have produced high results for the prediction of COVID-19 cases in South Korea. For isolated cases after performing noise filtering operations, machine learning techniques achieved an accuracy between 98–100%. The results indicate that filtering noise from the dataset can improve the accuracy of COVID-19 case prediction algorithms.

Keywords