Enhancing domain generalization in the AI-based analysis of chest radiographs with federated learning

Soroosh Tayebi Arasteh; Christiane Kuhl; Marwin-Jonathan Saehn; Peter Isfort; Daniel Truhn; Sven Nebelung

doi:10.1038/s41598-023-49956-8

Scientific Reports (Dec 2023)

Affiliations

Soroosh Tayebi Arasteh: Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen
Christiane Kuhl: Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen
Marwin-Jonathan Saehn: Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen
Peter Isfort: Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen
Daniel Truhn: Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen
Sven Nebelung: Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen

Journal volume & issue: Vol. 13, no. 1
pp. 1 – 12

Abstract

Developing robust artificial intelligence (AI) models that generalize well to unseen datasets is challenging and usually requires large and variable datasets, preferably from multiple institutions. In federated learning (FL), a model is trained collaboratively at numerous sites that hold local datasets without exchanging them. So far, the impact of training strategy, i.e., local versus collaborative, on the diagnostic on-domain and off-domain performance of AI models interpreting chest radiographs has not been assessed. Consequently, using 610,000 chest radiographs from five institutions across the globe, we assessed diagnostic performance as a function of training strategy (i.e., local vs. collaborative), network architecture (i.e., convolutional vs. transformer-based), single versus cross-institutional performance (i.e., on-domain vs. off-domain), imaging finding (i.e., cardiomegaly, pleural effusion, pneumonia, atelectasis, consolidation, pneumothorax, and no abnormality), dataset size (i.e., from n = 18,000 to 213,921 radiographs), and dataset diversity. Large datasets not only showed minimal performance gains with FL but, in some instances, even exhibited decreases. In contrast, smaller datasets revealed marked improvements. Thus, on-domain performance was mainly driven by training data size. However, off-domain performance leaned more on training diversity. When trained collaboratively across diverse external institutions, AI models consistently surpassed models trained locally for off-domain tasks, emphasizing FL’s potential in leveraging data diversity. In conclusion, FL can bolster diagnostic privacy, reproducibility, and off-domain reliability of AI models and, potentially, optimize healthcare outcomes.
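As an illustration of the collaborative training idea described above, the following minimal sketch shows one aggregation round in which each site contributes only its locally trained parameters, never its images. It assumes a size-weighted average in the style of federated averaging; the abstract does not state which aggregation rule the authors used, and the function name, site names, parameter values, and dataset sizes are all hypothetical.

```python
from typing import Dict, List


def federated_average(
    site_weights: Dict[str, List[float]],
    site_sizes: Dict[str, int],
) -> List[float]:
    """Combine per-site parameter vectors into one global vector.

    Assumption: each site's contribution is weighted by its share of the
    total number of training samples (federated-averaging style). Only
    parameters are passed in; raw data never leaves a site.
    """
    total = sum(site_sizes[site] for site in site_weights)
    n_params = len(next(iter(site_weights.values())))
    averaged = [0.0] * n_params
    for site, weights in site_weights.items():
        share = site_sizes[site] / total
        for i, w in enumerate(weights):
            averaged[i] += share * w
    return averaged


# Hypothetical example: two sites with toy two-parameter models.
site_weights = {"site_a": [1.0, 2.0], "site_b": [3.0, 6.0]}
site_sizes = {"site_a": 1000, "site_b": 3000}

global_weights = federated_average(site_weights, site_sizes)
print(global_weights)
```

In a full training loop, the averaged parameters would be sent back to every site as the starting point for the next round of local training. Under this weighting the larger site dominates the average, which is one reason a small site can benefit more from collaboration than a large one.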

Published in Scientific Reports

ISSN: 2045-2322 (Online)
Publisher: Nature Portfolio
Country of publisher: United Kingdom
LCC subjects: Medicine; Science
Website: https://www.nature.com/srep/
