Scientific Reports (May 2022)

Deep learning model for the automatic classification of COVID-19 pneumonia, non-COVID-19 pneumonia, and the healthy: a multi-center retrospective study

  • Mizuho Nishio,
  • Daigo Kobayashi,
  • Eiko Nishioka,
  • Hidetoshi Matsuo,
  • Yasuyo Urase,
  • Koji Onoue,
  • Reiichi Ishikura,
  • Yuri Kitamura,
  • Eiro Sakai,
  • Masaru Tomita,
  • Akihiro Hamanaka,
  • Takamichi Murakami

DOI
https://doi.org/10.1038/s41598-022-11990-3
Journal volume & issue
Vol. 12, no. 1
pp. 1 – 10

Abstract

Read online

Abstract This retrospective study aimed to develop and validate a deep learning model for the classification of coronavirus disease-2019 (COVID-19) pneumonia, non-COVID-19 pneumonia, and the healthy using chest X-ray (CXR) images. One private and two public datasets of CXR images were included. The private dataset included CXR from six hospitals. A total of 14,258 and 11,253 CXR images were included in the 2 public datasets and 455 in the private dataset. A deep learning model based on EfficientNet with noisy student was constructed using the three datasets. The test set of 150 CXR images in the private dataset were evaluated by the deep learning model and six radiologists. Three-category classification accuracy and class-wise area under the curve (AUC) for each of the COVID-19 pneumonia, non-COVID-19 pneumonia, and healthy were calculated. Consensus of the six radiologists was used for calculating class-wise AUC. The three-category classification accuracy of our model was 0.8667, and those of the six radiologists ranged from 0.5667 to 0.7733. For our model and the consensus of the six radiologists, the class-wise AUC of the healthy, non-COVID-19 pneumonia, and COVID-19 pneumonia were 0.9912, 0.9492, and 0.9752 and 0.9656, 0.8654, and 0.8740, respectively. Difference of the class-wise AUC between our model and the consensus of the six radiologists was statistically significant for COVID-19 pneumonia (p value = 0.001334). Thus, an accurate model of deep learning for the three-category classification could be constructed; the diagnostic performance of our model was significantly better than that of the consensus interpretation by the six radiologists for COVID-19 pneumonia.