Diagnostics (Mar 2023)
Cross Dataset Analysis of Domain Shift in CXR Lung Region Detection
Abstract
Domain shift is one of the key challenges affecting reliability in medical imaging-based machine learning predictions. It is of significant importance to investigate this issue to gain insights into its characteristics toward determining controllable parameters to minimize its impact. In this paper, we report our efforts on studying and analyzing domain shift in lung region detection in chest radiographs. We used five chest X-ray datasets, collected from different sources, which have manual markings of lung boundaries in order to conduct extensive experiments toward this goal. We compared the characteristics of these datasets from three aspects: information obtained from metadata or an image header, image appearance, and features extracted from a pretrained model. We carried out experiments to evaluate and compare model performances within each dataset and across datasets in four scenarios using different combinations of datasets. We proposed a new feature visualization method to provide explanations for the applied object detection network on the obtained quantitative results. We also examined chest X-ray modality-specific initialization, catastrophic forgetting, and model repeatability. We believe the observations and discussions presented in this work could help to shed some light on the importance of the analysis of training data for medical imaging machine learning research, and could provide valuable guidance for domain shift analysis.
Keywords