Фармакоэкономика (Oct 2021)

COVID-19 pandemic prediction model based on machine learning in selected regions of the Russian Federation

  • D. V. Gavrilov,
  • R. V. Abramov,
  • A. V. Kirilkina,
  • A. A. Ivshin,
  • R. E. Novitskiy

DOI
https://doi.org/10.17749/2070-4909/farmakoekonomika.2021.108
Journal volume & issue
Vol. 14, no. 3
pp. 342 – 356

Abstract

Background. Predicting the spread of the new coronavirus infection (COVID-19) is important for taking timely measures and initiating systemic preventive and anti-epidemic actions at both the regional and state levels to reduce morbidity and mortality.

Objective. To develop a model for short-term forecasting of COVID-19 cases and deaths in the Russian Federation.

Material and methods. The data for model training were collected from the Stopcoronavirus.rf and Johns Hopkins University portals. The data set included 13 features describing the dynamics of infection and mortality, as well as the rates of morbidity and mortality in different countries and in certain regions of the Russian Federation. The model was trained with the CatBoost gradient boosting method and retrained daily on updated data.

Results. A model forecasting COVID-19 cases and deaths up to 14 days ahead was created. The model's mean absolute percentage error (MAPE) ranged from 2.3% to 24% across 85 regions of the Russian Federation. A comparison of root mean square error (RMSE) values showed the advantage of the CatBoost machine learning method over linear regression. The model's error was lower for regions with a large population than for less populated ones.

Conclusion. The model can be used not only to predict the course of the novel coronavirus pandemic but also to monitor and assess the spread of other newly emerging infections during their emergence, peak incidence, and stabilization periods.
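The two error measures named in the Results, MAPE and RMSE, can be illustrated with a minimal sketch. It uses the standard textbook definitions, which the abstract does not spell out, so the exact formulas the authors applied are an assumption here. The function names `mape` and `rmse` and the sample numbers are hypothetical and are not data from the study.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (standard definition)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def rmse(actual, predicted):
    """Root mean square error, in the units of the series itself."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    total = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(total / len(actual))


# Made-up daily case counts, for illustration only (not data from the study).
actual = [100, 200, 400]
predicted = [110, 190, 400]

print(f"MAPE: {mape(actual, predicted):.2f}%")
print(f"RMSE: {rmse(actual, predicted):.2f}")
```

MAPE is relative to the size of each observed value, which makes it comparable across regions with very different case counts. RMSE stays in the units of the series and penalizes large misses more heavily.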

Keywords