PLoS ONE (Jan 2024)

Big data analysis for Covid-19 in hospital information systems.

  • Xinpa Ying,
  • Haiyang Peng,
  • Jun Xie

DOI
https://doi.org/10.1371/journal.pone.0294481
Journal volume & issue
Vol. 19, no. 5
p. e0294481

Abstract

Read online

The COVID-19 pandemic has triggered a global public health crisis, affecting hundreds of countries. With the increasing number of infected cases, developing automated COVID-19 identification tools based on CT images can effectively assist clinical diagnosis and reduce the tedious workload of image interpretation. To expand the dataset for machine learning methods, it is necessary to aggregate cases from different medical systems to learn robust and generalizable models. This paper proposes a novel deep learning joint framework that can effectively handle heterogeneous datasets with distribution discrepancies for accurate COVID-19 identification. We address the cross-site domain shift by redesigning the COVID-Net's network architecture and learning strategy, and independent feature normalization in latent space to improve prediction accuracy and learning efficiency. Additionally, we propose using a contrastive training objective to enhance the domain invariance of semantic embeddings and boost classification performance on each dataset. We develop and evaluate our method with two large-scale public COVID-19 diagnosis datasets containing CT images. Extensive experiments show that our method consistently improves the performance both datasets, outperforming the original COVID-Net trained on each dataset by 13.27% and 15.15% in AUC respectively, also exceeding existing state-of-the-art multi-site learning methods.