Nature Communications (Sep 2024)

An international study presenting a federated learning AI platform for pediatric brain tumors

  • Edward H. Lee
  • Michelle Han
  • Jason Wright
  • Michael Kuwabara
  • Jacob Mevorach
  • Gang Fu
  • Olivia Choudhury
  • Ujjwal Ratan
  • Michael Zhang
  • Matthias W. Wagner
  • Robert Goetti
  • Sebastian Toescu
  • Sebastien Perreault
  • Hakan Dogan
  • Emre Altinmakas
  • Maryam Mohammadzadeh
  • Kathryn A. Szymanski
  • Cynthia J. Campen
  • Hollie Lai
  • Azam Eghbal
  • Alireza Radmanesh
  • Kshitij Mankad
  • Kristian Aquilina
  • Mourad Said
  • Arastoo Vossough
  • Ozgur Oztekin
  • Birgit Ertl-Wagner
  • Tina Poussaint
  • Eric M. Thompson
  • Chang Y. Ho
  • Alok Jaju
  • John Curran
  • Vijay Ramaswamy
  • Samuel H. Cheshier
  • Gerald A. Grant
  • S. Simon Wong
  • Michael E. Moseley
  • Robert M. Lober
  • Mattias Wilms
  • Nils D. Forkert
  • Nicholas A. Vitanza
  • Jeffrey H. Miller
  • Laura M. Prolo
  • Kristen W. Yeom

DOI
https://doi.org/10.1038/s41467-024-51172-5
Journal volume & issue
Vol. 15, no. 1
pp. 1 – 11

Abstract

While multiple factors impact disease, artificial intelligence (AI) studies in medicine often use small, non-diverse patient cohorts due to data sharing and privacy issues. Federated learning (FL) has emerged as a solution, enabling training across hospitals without direct data sharing. Here, we present FL-PedBrain, an FL platform for pediatric posterior fossa brain tumors, and evaluate its performance on a diverse, realistic, multi-center cohort. Pediatric brain tumors were targeted due to the scarcity of such datasets, even in tertiary care hospitals. Our platform orchestrates federated training for joint tumor classification and segmentation across 19 international sites. FL-PedBrain exhibits less than a 1.5% decrease in classification and a 3% reduction in segmentation performance compared to centralized data training. FL boosts segmentation performance by 20 to 30% on three external, out-of-network sites. Finally, we explore the sources of data heterogeneity and examine FL robustness in real-world scenarios with data imbalances.
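
As a rough illustration of how training can proceed across hospitals without direct data sharing, the sketch below runs one round of federated averaging (FedAvg) on a toy two-parameter linear model. The abstract does not state which aggregation rule FL-PedBrain uses, so FedAvg is an assumption here, and the function names (`local_update`, `federated_average`, `federated_round`), the linear model, and the learning rate are all hypothetical stand-ins for the platform's actual classification and segmentation networks.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[float, float]  # (x, y) pair held privately by one site


def local_update(weights: Sequence[float], data: Sequence[Sample],
                 lr: float = 0.1) -> List[float]:
    """One gradient-descent step on a site's private data.

    Toy model (assumption): y = w * x + b with mean squared error.
    Only the updated weights leave the site, never the samples.
    """
    w, b = weights
    n = len(data)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
    return [w - lr * grad_w, b - lr * grad_b]


def federated_average(site_weights: Sequence[Sequence[float]],
                      site_sizes: Sequence[int]) -> List[float]:
    """Average per-site weights, weighting each site by its sample count."""
    total = sum(site_sizes)
    n_params = len(site_weights[0])
    return [
        sum(w[i] * n for w, n in zip(site_weights, site_sizes)) / total
        for i in range(n_params)
    ]


def federated_round(global_weights: Sequence[float],
                    site_datasets: Sequence[Sequence[Sample]]) -> List[float]:
    """One round: every site trains locally, the server averages the results."""
    updates = [local_update(global_weights, data) for data in site_datasets]
    sizes = [len(data) for data in site_datasets]
    return federated_average(updates, sizes)


if __name__ == "__main__":
    # Two hypothetical sites of unequal size (a simple data imbalance).
    sites = [[(1.0, 2.0)], [(2.0, 4.0), (3.0, 6.0)]]
    print(federated_round([0.0, 0.0], sites))
```

Weighting by sample count means larger sites pull the shared model harder, which is one place where the data imbalances mentioned in the abstract would show up in a scheme like this.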