Радіоелектронні і комп'ютерні системи [Radioelectronic and Computer Systems] (Sep 2023)

Comparative analysis of the machine learning models determining COVID-19 patient risk levels

  • Kseniia Bazilevych,
  • Olena Kyrylenko,
  • Yurii Parfenyuk,
  • Serhii Krivtsov,
  • Ievgen Meniailov,
  • Victoriya Kuznietcova,
  • Dmytro Chumachenko

DOI
https://doi.org/10.32620/reks.2023.3.01
Journal volume & issue
Vol. 0, no. 3
pp. 5 – 17

Abstract

The COVID-19 pandemic has posed unprecedented challenges to global healthcare systems, emphasizing the need for predictive tools that support resource allocation and patient care. This study evaluates and compares three machine learning methods (the Bayes criterion, logistic regression, and gradient boosting) for predicting the risk level of COVID-19 patients from their symptoms, status, and medical history. The research focuses on the process of patient state determination; its subject is the machine learning methods used for that determination. To achieve this aim, the following tasks were formulated: analyze methods and models for determining the state of COVID-19 patients; develop classification models of patient state based on the Bayes criterion, on logistic regression, and on gradient boosting; develop the information system; conduct an experimental study using the machine learning methods; and analyze the results of the experimental study.

Methods: Using a dataset provided by the Mexican government, which covers over a million unique patients with 21 distinct features, we developed an information system in the C# programming language. The system lets users select their preferred method for risk calculation, offering a real-time decision-making tool for healthcare professionals.

Results: All models demonstrated high accuracy. However, subtle differences were observed in their other performance metrics, such as sensitivity, precision, and F1-score. The gradient boosting method slightly outperformed the other models in overall accuracy.

Conclusions: While each model has its merits, the choice of method should be based on the specific needs and constraints of the healthcare system. Gradient boosting emerged as marginally superior in this study. This research underscores the potential of machine learning to enhance pandemic response strategies, offering both scientific insights and practical tools for healthcare professionals.
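
The metrics named in the results (accuracy, sensitivity, precision, and F1-score) can all be derived from the confusion-matrix counts of a binary classifier. The following is a minimal Python sketch of those standard definitions, not the authors' C# implementation. The function name `binary_metrics`, the `BinaryMetrics` container, the encoding of 1 as "high risk" and 0 as "low risk", and the toy labels are all assumptions made for illustration; none of the values come from the study.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    sensitivity: float  # also called recall or true-positive rate
    precision: float
    f1: float


def binary_metrics(y_true, y_pred):
    """Compute standard binary-classification metrics.

    Assumes labels are encoded as 1 (positive, e.g. high risk)
    and 0 (negative, e.g. low risk). This encoding is hypothetical.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    total = tp + tn + fp + fn
    # Guard every ratio against an empty denominator.
    accuracy = (tp + tn) / total if total else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return BinaryMetrics(accuracy, sensitivity, precision, f1)


if __name__ == "__main__":
    # Toy labels, invented for illustration only.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
    print(binary_metrics(y_true, y_pred))
```

Two models with similar accuracy can still differ in sensitivity and precision, which is why the choice between them may depend on whether missed high-risk patients or false alarms are more costly in a given healthcare setting.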

Keywords