Healthcare (Jan 2022)

Diagnostic Performance of a Deep Learning Model Deployed at a National COVID-19 Screening Facility for Detection of Pneumonia on Frontal Chest Radiographs

  • Jordan Z. T. Sim,
  • Yong-Han Ting,
  • Yuan Tang,
  • Yangqin Feng,
  • Xiaofeng Lei,
  • Xiaohong Wang,
  • Wen-Xiang Chen,
  • Su Huang,
  • Sum-Thai Wong,
  • Zhongkang Lu,
  • Yingnan Cui,
  • Soo-Kng Teo,
  • Xin-Xing Xu,
  • Wei-Min Huang,
  • Cher-Heng Tan

DOI
https://doi.org/10.3390/healthcare10010175
Journal volume & issue
Vol. 10, no. 1
p. 175

Abstract

Read online

(1) Background: Chest radiographs are the mainstay of initial radiological investigation in this COVID-19 pandemic. A reliable and readily deployable artificial intelligence (AI) algorithm that detects pneumonia in COVID-19 suspects can be useful for screening or triage in a hospital setting. This study has a few objectives: first, to develop a model that accurately detects pneumonia in COVID-19 suspects; second, to assess its performance in a real-world clinical setting; and third, by integrating the model with the daily clinical workflow, to measure its impact on report turn-around time. (2) Methods: The model was developed from the NIH Chest-14 open-source dataset and fine-tuned using an internal dataset comprising more than 4000 CXRs acquired in our institution. Input from two senior radiologists provided the reference standard. The model was integrated into daily clinical workflow, prioritising abnormal CXRs for expedited reporting. Area under the receiver operating characteristic curve (AUC), F1 score, sensitivity, and specificity were calculated to characterise diagnostic performance. The average time taken by radiologists in reporting the CXRs was compared against the mean baseline time taken prior to implementation of the AI model. (3) Results: 9431 unique CXRs were included in the datasets, of which 1232 were ground truth-labelled positive for pneumonia. On the “live” dataset, the model achieved an AUC of 0.95 (95% confidence interval (CI): 0.92, 0.96) corresponding to a specificity of 97% (95% CI: 0.97, 0.98) and sensitivity of 79% (95% CI: 0.72, 0.84). No statistically significant degradation of diagnostic performance was encountered during clinical deployment, and report turn-around time was reduced by 22%. (4) Conclusion: In real-world clinical deployment, our model expedites reporting of pneumonia in COVID-19 suspects while preserving diagnostic performance without significant model drift.

Keywords