Scientific Reports (May 2021)

COVID-19 diagnosis by routine blood tests using machine learning

  • Matjaž Kukar
  • Gregor Gunčar
  • Tomaž Vovko
  • Simon Podnar
  • Peter Černelč
  • Miran Brvar
  • Mateja Zalaznik
  • Mateja Notar
  • Sašo Moškon
  • Marko Notar

DOI
https://doi.org/10.1038/s41598-021-90265-9
Journal volume & issue
Vol. 11, no. 1
pp. 1 – 9

Abstract

Physicians taking care of patients with COVID-19 have described different changes in routine blood parameters. However, these changes hinder them from performing COVID-19 diagnoses. We constructed a machine learning model for COVID-19 diagnosis that was based and cross-validated on the routine blood tests of 5333 patients with various bacterial and viral infections, and 160 COVID-19-positive patients. We selected the operational ROC point at a sensitivity of 81.9% and a specificity of 97.9%. The cross-validated AUC was 0.97. The five most useful routine blood parameters for COVID-19 diagnosis according to the feature importance scoring of the XGBoost algorithm were: MCHC, eosinophil count, albumin, INR, and prothrombin activity percentage. t-SNE visualization showed that the blood parameters of the patients with a severe COVID-19 course are more like the parameters of a bacterial than a viral infection. The reported diagnostic accuracy is at least comparable and probably complementary to RT-PCR and chest CT studies. Patients with fever, cough, myalgia, and other symptoms can now have initial routine blood tests assessed by our diagnostic tool. All patients with a positive COVID-19 prediction would then undergo standard RT-PCR studies to confirm the diagnosis. We believe that our results represent a significant contribution to improvements in COVID-19 diagnosis.
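
The abstract reports an operating point chosen on the ROC curve, meaning a single score threshold with a given sensitivity and specificity. The following is a minimal sketch of how such a point might be selected. It is not the authors' code: the function names, the selection rule (highest sensitivity subject to a minimum specificity), and the toy labels and scores are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def sensitivity_specificity(
    y_true: List[int], scores: List[float], threshold: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) when scores >= threshold count as positive."""
    tp = fn = tn = fp = 0
    for label, score in zip(y_true, scores):
        predicted_positive = score >= threshold
        if label == 1:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_positive:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


def pick_operating_point(
    y_true: List[int], scores: List[float], min_specificity: float
) -> Optional[Tuple[float, float, float]]:
    """Hypothetical rule: among thresholds whose specificity is at least
    min_specificity, return (threshold, sensitivity, specificity) for the
    one with the highest sensitivity, or None if no threshold qualifies."""
    best = None
    for threshold in sorted(set(scores)):
        sens, spec = sensitivity_specificity(y_true, scores, threshold)
        if spec >= min_specificity and (best is None or sens > best[1]):
            best = (threshold, sens, spec)
    return best


# Toy data (illustrative only, not from the study): 1 = positive, 0 = negative.
toy_labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
toy_scores = [0.05, 0.10, 0.20, 0.30, 0.40, 0.70, 0.35, 0.60, 0.80, 0.90]

point = pick_operating_point(toy_labels, toy_scores, min_specificity=0.8)
print(point)
```

A high specificity requirement, as in the abstract's reported operating point, keeps false positives low when negatives greatly outnumber positives, at the cost of some sensitivity.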