Scientific Reports (Sep 2021)

Deep learning for distinguishing normal versus abnormal chest radiographs and generalization to two unseen diseases tuberculosis and COVID-19

  • Zaid Nabulsi,
  • Andrew Sellergren,
  • Shahar Jamshy,
  • Charles Lau,
  • Edward Santos,
  • Atilla P. Kiraly,
  • Wenxing Ye,
  • Jie Yang,
  • Rory Pilgrim,
  • Sahar Kazemzadeh,
  • Jin Yu,
  • Sreenivasa Raju Kalidindi,
  • Mozziyar Etemadi,
  • Florencia Garcia-Vicente,
  • David Melnick,
  • Greg S. Corrado,
  • Lily Peng,
  • Krish Eswaran,
  • Daniel Tse,
  • Neeral Beladia,
  • Yun Liu,
  • Po-Hsuan Cameron Chen,
  • Shravya Shetty

DOI
https://doi.org/10.1038/s41598-021-93967-2
Journal volume & issue
Vol. 11, no. 1
pp. 1 – 15

Abstract

Chest radiography (CXR) is the most widely used thoracic clinical imaging modality and is crucial for guiding the management of cardiothoracic conditions. The detection of specific CXR findings has been the main focus of several artificial intelligence (AI) systems. However, the wide range of possible CXR abnormalities makes it impractical to detect every possible condition by building multiple separate systems, each of which detects one or more pre-specified conditions. In this work, we developed and evaluated an AI system to classify CXRs as normal or abnormal. For training and tuning the system, we used a de-identified dataset of 248,445 patients from a multi-city hospital network in India. To assess generalizability, we evaluated our system using 6 international datasets from India, China, and the United States. Of these datasets, 4 focused on diseases that the AI was not trained to detect: 2 datasets with tuberculosis and 2 datasets with coronavirus disease 2019. Our results suggest that the AI system trained using a large dataset containing a diverse array of CXR abnormalities generalizes to new patient populations and unseen diseases. In a simulated workflow where the AI system prioritized abnormal cases, the turnaround time for abnormal cases decreased by 7–28%. These results represent an important step towards evaluating whether AI can be safely used to flag cases in a general setting where previously unseen abnormalities exist. Lastly, to facilitate the continued development of AI models for CXR, we release our collected labels for the publicly available dataset.
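
The simulated prioritization workflow mentioned above can be illustrated with a toy queueing model. The sketch below is not the authors' simulation: the single-reader setup, the fixed read time, the arrival process, the score distributions, and the function name `simulate` are all assumptions made for illustration, and its output says nothing about the reported 7–28% figure. It only shows the general idea of comparing the turnaround time of abnormal cases when a reader works in arrival order versus in order of descending AI score.

```python
import heapq
import random


def simulate(cases, read_minutes, prioritize):
    """Return the mean turnaround (minutes) for truly abnormal cases.

    Hypothetical single-reader model, not the method used in the paper.
    cases: list of (arrival_minute, ai_score, is_abnormal), sorted by arrival.
    prioritize: if True, the reader picks the waiting case with the highest
    AI score; otherwise cases are read in arrival order.
    """
    queue = []  # heap of (priority_key, index, arrival, is_abnormal)
    clock = 0.0
    i = 0
    turnarounds = []
    while i < len(cases) or queue:
        # Admit every case that has arrived by the current time.
        while i < len(cases) and cases[i][0] <= clock:
            arrival, score, abnormal = cases[i]
            key = -score if prioritize else i
            heapq.heappush(queue, (key, i, arrival, abnormal))
            i += 1
        if not queue:
            # Reader is idle: jump ahead to the next arrival.
            clock = cases[i][0]
            continue
        _, _, arrival, abnormal = heapq.heappop(queue)
        clock += read_minutes  # fixed read time per case (assumption)
        if abnormal:
            turnarounds.append(clock - arrival)
    return sum(turnarounds) / len(turnarounds)


# Synthetic worklist; every number here is made up for illustration.
rng = random.Random(0)
cases, t = [], 0.0
for _ in range(500):
    t += rng.expovariate(1 / 4.0)  # mean of 4 minutes between arrivals
    abnormal = rng.random() < 0.3
    score = min(1.0, max(0.0, rng.gauss(0.7 if abnormal else 0.3, 0.15)))
    cases.append((t, score, abnormal))

fifo = simulate(cases, 4.0, prioritize=False)
prio = simulate(cases, 4.0, prioritize=True)
print(f"relative reduction in abnormal-case turnaround: {(fifo - prio) / fifo:.1%}")
```

In a model like this, the size of the reduction depends on how often cases queue up and on how well the score separates abnormal from normal cases, so a real evaluation would need actual arrival and reporting timestamps.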