BMC Medical Research Methodology (Jul 2023)

Bayesian model averaging for predicting factors associated with length of COVID-19 hospitalization

  • Shabnam Bahrami,
  • Karimollah Hajian-Tilaki,
  • Masomeh Bayani,
  • Mohammad Chehrazi,
  • Zahra Mohamadi-Pirouz,
  • Abazar Amoozadeh

DOI
https://doi.org/10.1186/s12874-023-01981-x
Journal volume & issue
Vol. 23, no. 1
pp. 1 – 17

Abstract

Introduction: The length of hospital stay (LOHS) due to COVID-19 has imposed a financial burden on the healthcare system and a high psychological burden on patients and health workers. This study applies Bayesian model averaging (BMA) based on linear regression models to determine the predictors of LOHS in COVID-19 patients. Methods: In this historical cohort study, 4996 of the 5100 COVID-19 patients registered in the hospital database were eligible for inclusion. The data comprised demographic and clinical characteristics, biomarkers, and LOHS. Factors affecting LOHS were fitted with six models: classical linear regression with variable selection by the stepwise method, AIC, and BIC; two BMA models using the Occam's Window and Markov Chain Monte Carlo (MCMC) methods; and the GBDT algorithm, a more recent machine learning method. Results: The average length of hospitalization was 6.7 ± 5.7 days. Among the classical linear models, the stepwise and AIC methods (R² = 0.168, adjusted R² = 0.165) both performed better than BIC (R² = 0.160, adjusted R² = 0.158). Among the BMA models, Occam's Window performed better than MCMC, with R² = 0.174. The GBDT method, with R² = 0.64, performed worse than BMA on the testing dataset but not on the training dataset. Across the six fitted models, ICU hospitalization, respiratory distress, age, diabetes, CRP, PO2, WBC, AST, BUN, and NLR were significantly associated with LOHS in COVID-19 patients. Conclusion: BMA with the Occam's Window method fit better and performed better on the testing dataset than the other models in identifying factors that predict LOHS.
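
To illustrate the general idea behind BMA with an Occam's Window rule, the following is a minimal sketch and not the authors' implementation. It assumes equal prior probability for every model, approximates posterior model probabilities with exp(-BIC/2), enumerates all predictor subsets exhaustively (feasible only for a handful of predictors), and applies only the simple rule of discarding models whose posterior odds against the best model fall below 1/20. The function names `fit_ols` and `bma_occam`, the window value of 20, and the synthetic data are all assumptions made for illustration.

```python
import itertools
import numpy as np


def fit_ols(X, y):
    """Fit OLS with an intercept; return (coefficients, BIC)."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])          # intercept + chosen predictors
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ beta) ** 2))
    k = A.shape[1]                                # number of fitted coefficients
    bic = n * np.log(rss / n) + k * np.log(n)     # Gaussian BIC up to a constant
    return beta, bic


def bma_occam(X, y, window=20.0):
    """Return (inclusion probabilities, model-averaged coefficients).

    Hypothetical helper: BIC-approximated BMA over all predictor subsets,
    keeping only models within the (assumed) Occam's Window ratio.
    """
    n, p = X.shape
    models = []
    for r in range(p + 1):
        for subset in itertools.combinations(range(p), r):
            idx = list(subset)
            beta, bic = fit_ols(X[:, idx], y)
            models.append((idx, beta, bic))

    bics = np.array([m[2] for m in models])
    # Unnormalized posterior weight relative to the best (lowest-BIC) model.
    w = np.exp(-(bics - bics.min()) / 2.0)
    # Occam's Window: drop models far less probable than the best one.
    w = np.where(w >= 1.0 / window, w, 0.0)
    w /= w.sum()

    incl = np.zeros(p)        # posterior inclusion probability per predictor
    coef = np.zeros(p + 1)    # averaged coefficients, index 0 is the intercept
    for (idx, beta, _), wi in zip(models, w):
        if wi == 0.0:
            continue
        incl[idx] += wi
        coef[0] += wi * beta[0]
        coef[[i + 1 for i in idx]] += wi * beta[1:]
    return incl, coef


# Synthetic demonstration data (not from the study): only x0 affects y.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 3))
y_demo = 2.0 * X_demo[:, 0] + rng.normal(size=200)
incl_demo, coef_demo = bma_occam(X_demo, y_demo)
```

Here `incl_demo` holds, for each predictor, the summed posterior weight of the retained models that include it, which is the quantity BMA typically reports when judging whether a factor is associated with the outcome.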

Keywords