BMC Medical Informatics and Decision Making (Nov 2021)

Effect of deep learning-based assistive technology use on chest radiograph interpretation by emergency department physicians: a prospective interventional simulation-based study

  • Ji Hoon Kim,
  • Sang Gil Han,
  • Ara Cho,
  • Hye Jung Shin,
  • Song-Ee Baek

DOI
https://doi.org/10.1186/s12911-021-01679-4
Journal volume & issue
Vol. 21, no. 1
pp. 1 – 9

Abstract

Background: Interpretation of chest radiographs (CRs) by emergency department (ED) physicians is inferior to that by radiologists. Recent studies have investigated the effect of deep learning-based assistive technology on CR interpretation (DLCR), but its relevance to ED physicians remains unclear. This study aimed to investigate whether DLCR supports CR interpretation and clinical decision-making by ED physicians.

Methods: We conducted a prospective interventional study using a web-based performance assessment system. Participants were recruited through an official notice addressed to board-certified emergency physicians and residents working at the ED where the study was conducted. Eight ED physicians volunteered; one withdrew during the performance assessment, leaving seven for inclusion. The seven physicians' CR interpretations and clinical decision-making were assessed using clinical data from 388 patients, including detection of the target lesion with DLCR. Participant performance was evaluated by area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, and accuracy analyses; decision-making consistency was measured by kappa statistics. ED physicians with < 24 months of experience were defined as 'inexperienced'.

Results: Of the 388 simulated cases, 259 (66.8%) had a CR abnormality. The median DLCR abnormality score for these cases was 59.3 (31.77, 76.25), compared with 3.35 (1.57, 8.89) for cases with a normal CR. Performance differed between ED physicians working with and without DLCR (AUROC: 0.801, P < 0.001). Diagnostic sensitivity and accuracy of CR interpretation were higher for all ED physicians when working with DLCR than when working without it. The overall kappa value for decision-making consistency was 0.902 (95% confidence interval [CI] 0.884–0.920); the kappa value was 0.956 (95% CI 0.934–0.979) for the experienced group and 0.862 (95% CI 0.835–0.889) for the inexperienced group.

Conclusions: This study presents preliminary evidence that ED physicians using DLCR in a clinical setting perform better at CR interpretation than their counterparts who do not use this technology. DLCR use influenced the clinical decision-making of inexperienced physicians more strongly than that of experienced physicians. These findings require prospective validation before DLCR can be recommended for use in routine clinical practice.
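For readers unfamiliar with the metrics named in the abstract, the following is a minimal sketch of how sensitivity, specificity, accuracy, and a kappa statistic can be computed from binary labels. It is not the authors' analysis code. The function names and the toy data are hypothetical, and the use of Cohen's kappa is an assumption, since the abstract says only "kappa statistics".

```python
from collections import Counter


def diagnostic_metrics(truth, pred):
    """Sensitivity, specificity, and accuracy for binary labels (1 = abnormal CR)."""
    pairs = list(zip(truth, pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # abnormal, called abnormal
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal, called normal
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # normal, called abnormal
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # abnormal, called normal
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


def cohen_kappa(a, b):
    """Cohen's kappa: agreement between two sets of decisions, corrected for chance.

    Undefined (division by zero) when chance agreement equals 1.
    """
    n = len(a)
    observed = sum(1 for x, y in zip(a, b) if x == y) / n
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[k] * count_b[k] for k in set(a) | set(b)) / (n * n)
    return (observed - expected) / (1 - expected)


# Toy data, invented purely for illustration (not taken from the study).
truth = [1, 1, 1, 1, 0, 0, 0, 0]              # reference standard per case
reading = [1, 1, 1, 0, 0, 0, 0, 1]            # one physician's interpretation
decision_without = [1, 1, 0, 0, 1, 0, 1, 0]   # decisions made without assistance
decision_with = [1, 1, 0, 0, 1, 0, 0, 0]      # decisions made with assistance

print(diagnostic_metrics(truth, reading))
print(cohen_kappa(decision_without, decision_with))
```

A kappa close to 1 means the two sets of decisions almost always agree, so a lower kappa in one group indicates that decisions changed more often between the two conditions.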

Keywords