BMC Infectious Diseases (Dec 2023)

Association between biochemical and hematologic factors with COVID-19 using data mining methods

  • Amin Mansoori,
  • Nafiseh Hosseini,
  • Hamideh Ghazizadeh,
  • Malihe Aghasizadeh,
  • Susan Drroudi,
  • Toktam Sahranavard,
  • Hanie Salmani Izadi,
  • Amirhossein Amiriani,
  • Ehsan Mosa Farkhani,
  • Gordon A. Ferns,
  • Majid Ghayour-Mobarhan,
  • Mohsen Moohebati,
  • Habibollah Esmaily

DOI
https://doi.org/10.1186/s12879-023-08676-0
Journal volume & issue
Vol. 23, no. 1
pp. 1 – 13

Abstract

Background and aim: Coronavirus disease (COVID-19) is an infectious disease that can spread very rapidly, with important public health impacts. Predicting the important factors related to a patient's infectious disease is helpful to health care workers. The aim of this research was to identify the critical features among demographic, biochemical, and hematological characteristics in patients with and without COVID-19 infection.

Method: A total of 13,170 participants in the age range of 35–65 years were recruited. Decision Tree (DT), Logistic Regression (LR), and Bootstrap Forest (BF) techniques were fitted to the data. Three models were considered in this study: Model I used the biochemical features, Model II the hematological features, and Model III both the biochemical and hematological features.

Results: In Model I, the BF, DT, and LR algorithms identified creatine phosphokinase (CPK), blood urea nitrogen (BUN), fasting blood glucose (FBG), total bilirubin, body mass index (BMI), sex, and age as important predictors for COVID-19. In Model II, the BF, DT, and LR algorithms identified BMI, sex, mean platelet volume (MPV), and age as important predictors. In Model III, the BF, DT, and LR algorithms identified CPK, BMI, MPV, BUN, FBG, sex, creatinine (Cr), age, and total bilirubin as important predictors.

Conclusion: The proposed BF, DT, and LR models appear to be able to predict and classify infected and non-infected people based on CPK, BUN, BMI, MPV, FBG, sex, Cr, and age, which had a high association with COVID-19.
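The predictor lists reported for the three models overlap considerably, and a short tally can make the overlap easier to see. The following is a minimal sketch that only restates the lists given in the abstract; the dictionary layout and the helper names `shared_predictors` and `only_in` are illustrative assumptions and are not part of the study's analysis.

```python
# Predictors reported in the abstract for each model, transcribed as written.
# The data structure and helper names below are hypothetical, for illustration only.
reported = {
    "Model I": {"CPK", "BUN", "FBG", "total bilirubin", "BMI", "sex", "age"},
    "Model II": {"BMI", "sex", "MPV", "age"},
    "Model III": {"CPK", "BMI", "MPV", "BUN", "FBG", "sex", "Cr", "age",
                  "total bilirubin"},
}


def shared_predictors(models):
    """Return the predictors that appear in every model's reported list."""
    return set.intersection(*models.values())


def only_in(models, name):
    """Return predictors reported for `name` and for no other model."""
    others = set().union(*(v for k, v in models.items() if k != name))
    return models[name] - others


print(sorted(shared_predictors(reported)))
print(sorted(only_in(reported, "Model III")))
```

Under this transcription, the first call returns the predictors common to all three lists, and the second returns any predictor reported only for the combined model.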

Keywords