Alexandria Engineering Journal (Sep 2023)

Bayesian machine learning analysis with Markov Chain Monte Carlo techniques for assessing characteristics and risk factors of Covid-19 in Erbil City-Iraq 2020–2021

  • Hewir Abdulqadir Khidir,
  • İlker Etikan,
  • Dler Hussein Kadir,
  • Nozad H. Mahmood,
  • R. Sabetvand

Journal volume & issue
Vol. 78
pp. 162 – 174

Abstract

The study aims to showcase the application of machine learning techniques to medical datasets in order to improve the identification of correlations and relationships between variables, which in turn supports more informed decision-making. Unlike other studies, it uses intensive statistical modelling to identify the variables that contribute to death from Covid-19 and to estimate their effects. Because the dataset is large, conventional approaches do not lead to reliable conclusions. A Bayesian technique is therefore applied to generate predictive posterior distributions of the unknown parameters in both the neural network and the logistic regression models, which helps to avoid overfitting in machine learning applications and provides additional measures for assessing the performance of the fitted model. According to the results of the statistical analysis, the Bayesian neural network demonstrated superior performance in terms of classification measurements such as AUC (84.66%), F1 (87.11%), Precision (88.29%), and Recall (85.96%). The Bayesian logistic regression also performed well, but with slightly lower scores, achieving AUC (83.07%), F1 (85.59%), Precision (84.55%), and Recall (85.59%). In contrast, the logistic regression (MLE) technique had the worst performance, with very low scores (AUC = 52.38%, F1 = 57.55%, Precision = 57.01%, Recall = 58.10%). Regarding the variables' association with mortality, stepwise forward selection identified seven significant variables. Age was found to be the most significant variable in predicting the probability of dying: compared with patients below 18 years old, patients aged 18–44 had 12 times higher odds of dying, patients aged 45–64 had 123 times higher odds, and patients above 65 years old had 436 times higher odds. Severe coughing was also significant, with an odds ratio of 7.26, and patients suffering from diabetes had 2.82 times higher odds of dying.
Moreover, SpO2 was associated with a 20% decrease in the relative risk of dying from Covid-19. Gender and smoking did not show a significant association with mortality. Finally, the Bayesian approach showed higher sensitivity and specificity than the classic neural network.
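
To make the reported odds ratios more concrete, the following is a minimal sketch of Bayesian logistic regression fitted with a random-walk Metropolis sampler, one of the simplest Markov Chain Monte Carlo techniques. It is not the authors' implementation: the 2x2 counts in `GROUPS` are invented for illustration and do not come from the study, and the normal prior, step size, chain length, and burn-in are all assumptions. The sketch only shows how posterior draws of a coefficient can be turned into an odds ratio with a credible interval.

```python
import math
import random
import statistics

# Hypothetical 2x2 counts (NOT from the study): (patients, deaths) per group.
# Key 1 means the risk factor is present, key 0 means it is absent.
GROUPS = {0: (1000, 100), 1: (1000, 250)}


def log1pexp(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def log_posterior(b0, b1, prior_sd=10.0):
    """Log posterior (up to a constant) with independent N(0, prior_sd^2) priors."""
    total = 0.0
    for x, (n, deaths) in GROUPS.items():
        z = b0 + b1 * x  # log-odds of death in this group
        total += deaths * z - n * log1pexp(z)  # binomial log-likelihood
    total -= 0.5 * ((b0 / prior_sd) ** 2 + (b1 / prior_sd) ** 2)
    return total


def metropolis(n_iter=20000, step=0.1, seed=1):
    """Random-walk Metropolis sampler; returns a list of (b0, b1) draws."""
    rng = random.Random(seed)
    b0, b1 = 0.0, 0.0
    current = log_posterior(b0, b1)
    draws = []
    for _ in range(n_iter):
        c0 = b0 + rng.gauss(0.0, step)
        c1 = b1 + rng.gauss(0.0, step)
        candidate = log_posterior(c0, c1)
        # Accept the proposal with probability min(1, posterior ratio).
        if math.log(1.0 - rng.random()) < candidate - current:
            b0, b1, current = c0, c1, candidate
        draws.append((b0, b1))
    return draws


draws = metropolis()[5000:]  # discard the first 5000 draws as burn-in
odds_ratios = sorted(math.exp(b1) for _, b1 in draws)
or_median = statistics.median(odds_ratios)
or_low = odds_ratios[int(0.025 * len(odds_ratios))]
or_high = odds_ratios[int(0.975 * len(odds_ratios))]
print(f"odds ratio: median {or_median:.2f}, 95% interval ({or_low:.2f}, {or_high:.2f})")
```

Because the single covariate here is binary, the likelihood is computed from group counts rather than from individual records, which keeps the sketch fast. A model with several covariates, such as the seven selected in the study, would evaluate the same likelihood patient by patient and would normally rely on an established MCMC library rather than a hand-written sampler.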

Keywords