BMC Medical Imaging (May 2022)

Deep transfer learning to quantify pleural effusion severity in chest X-rays

  • Tao Huang,
  • Rui Yang,
  • Longbin Shen,
  • Aozi Feng,
  • Li Li,
  • Ningxia He,
  • Shuna Li,
  • Liying Huang,
  • Jun Lyu

DOI
https://doi.org/10.1186/s12880-022-00827-0
Journal volume & issue
Vol. 22, no. 1
pp. 1 – 11

Abstract

Read online

Abstract Purpose The detection of pleural effusion in chest radiography is crucial for doctors to make timely treatment decisions for patients with chronic obstructive pulmonary disease. We used the MIMIC-CXR database to develop a deep learning model to quantify pleural effusion severity in chest radiographs. Methods The Medical Information Mart for Intensive Care Chest X-ray (MIMIC-CXR) dataset was divided into patients ‘with’ or ‘without’ chronic obstructive pulmonary disease (COPD). The label of pleural effusion severity was obtained from the extracted COPD radiology reports and classified into four categories: no effusion, small effusion, moderate effusion, and large effusion. A total of 200 datasets were randomly sampled to manually check each item and determine whether the tags are correct. A professional doctor re-tagged these items as a verification cohort without knowing their previous tags. The learning models include eight common network structures including Resnet, DenseNet, and GoogleNET. Three data processing methods (no sampling, downsampling, and upsampling) and two loss algorithms (focal loss and cross-entropy loss) were used for unbalanced data. The Neural Network Intelligence tool was applied to train the model. Receiver operating characteristic curves, Area under the curve, and confusion matrix were employed to evaluate the model results. Grad-CAM was used for model interpretation. Results Among the 8533 patients, 15,620 chest X-rays with clearly marked pleural effusion severity were obtained (no effusion, 5685; small effusion, 4877; moderate effusion, 3657; and large effusion, 1401). The error rate of the manual check label was 6.5%, and the error rate of the doctor’s relabeling was 11.0%. The highest accuracy rate of the optimized model was 73.07. The micro-average AUCs of the testing and validation cohorts was 0.89 and 0.90, respectively, and their macro-average AUCs were 0.86 and 0.89, respectively. The AUC of the distinguishing results of each class and the other three classes were 0.95 and 0.94, 0.76 and 0.83, 0.85 and 0.83, and 0.87 and 0.93. Conclusion The deep transfer learning model can grade the severity of pleural effusion.

Keywords