Computation (May 2022)
Investigation of Statistical Machine Learning Models for COVID-19 Epidemic Process Simulation: Random Forest, K-Nearest Neighbors, Gradient Boosting
Abstract
COVID-19 has become the largest pandemic in recent history. This study develops and investigates three models of the COVID-19 epidemic process based on statistical machine learning and evaluates their forecasting results. The models are based on the Random Forest, K-Nearest Neighbors, and Gradient Boosting methods. They were assessed for adequacy and for the accuracy of predicted incidence over forecast horizons of 3, 7, 10, 14, 21, and 30 days. The study used data on new cases of COVID-19 in Germany, Japan, South Korea, and Ukraine. These countries were selected because they show different dynamics of the COVID-19 epidemic process and because their governments applied different control measures to contain the pandemic. The simulation results showed that the K-Nearest Neighbors and Gradient Boosting models are sufficiently accurate for practical use. Public health agencies can use the models and their forecasts to address various pandemic containment challenges. These challenges are examined according to the duration of the forecast.
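As an illustration of the kind of forecasting evaluated here, the following minimal sketch applies a K-Nearest Neighbors regressor to lagged daily case counts and scores it at the six forecast horizons named above. It is not the study's implementation: the lag-window features, the recursive multi-step scheme, the MAPE metric, the parameter values (`window=7`, `k=3`), all function names, and the synthetic series are assumptions made for illustration only.

```python
import math

HORIZONS = [3, 7, 10, 14, 21, 30]  # forecast durations named in the abstract


def make_windows(series, window):
    """Split a series into (lag window, next value) training pairs."""
    return [(series[i:i + window], series[i + window])
            for i in range(len(series) - window)]


def knn_predict(pairs, query, k):
    """Average the targets of the k lag windows closest to `query` (Euclidean)."""
    nearest = sorted(pairs, key=lambda p: math.dist(p[0], query))[:k]
    return sum(target for _, target in nearest) / len(nearest)


def forecast(series, horizon, window=7, k=3):
    """Recursive multi-step forecast: each prediction is fed back as an input."""
    pairs = make_windows(series, window)
    history = list(series)
    for _ in range(horizon):
        history.append(knn_predict(pairs, history[-window:], k))
    return history[len(series):]


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (assumed metric)."""
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100 * sum(errors) / len(errors)


# Synthetic placeholder data (NOT real case counts): a weekly-periodic pattern.
cases = [100 + 20 * (day % 7) for day in range(120)]
train, test = cases[:90], cases[90:]

for h in HORIZONS:
    print(h, round(mape(test[:h], forecast(train, h)), 2))
```

Replacing `knn_predict` with a Random Forest or Gradient Boosting regressor trained on the same lag windows would give the other two model families; real incidence data would take the place of the synthetic `cases` list.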
Keywords