Nature Communications (Jul 2025)

Towards fair decentralized benchmarking of healthcare AI algorithms with the Federated Tumor Segmentation (FeTS) challenge

  • Maximilian Zenk,
  • Ujjwal Baid,
  • Sarthak Pati,
  • Akis Linardos,
  • Brandon Edwards,
  • Micah Sheller,
  • Patrick Foley,
  • Alejandro Aristizabal,
  • David Zimmerer,
  • Alexey Gruzdev,
  • Jason Martin,
  • Russell T. Shinohara,
  • Annika Reinke,
  • Fabian Isensee,
  • Santhosh Parampottupadam,
  • Kaushal Parekh,
  • Ralf Floca,
  • Hasan Kassem,
  • Bhakti Baheti,
  • Siddhesh Thakur,
  • Verena Chung,
  • Kaisar Kushibar,
  • Karim Lekadir,
  • Meirui Jiang,
  • Youtan Yin,
  • Hongzheng Yang,
  • Quande Liu,
  • Cheng Chen,
  • Qi Dou,
  • Pheng-Ann Heng,
  • Xiaofan Zhang,
  • Shaoting Zhang,
  • Muhammad Irfan Khan,
  • Mohammad Ayyaz Azeem,
  • Mojtaba Jafaritadi,
  • Esa Alhoniemi,
  • Elina Kontio,
  • Suleiman A. Khan,
  • Leon Mächler,
  • Ivan Ezhov,
  • Florian Kofler,
  • Suprosanna Shit,
  • Johannes C. Paetzold,
  • Timo Loehr,
  • Benedikt Wiestler,
  • Himashi Peiris,
  • Kamlesh Pawar,
  • Shenjun Zhong,
  • Zhaolin Chen,
  • Munawar Hayat,
  • Gary Egan,
  • Mehrtash Harandi,
  • Ece Isik Polat,
  • Gorkem Polat,
  • Altan Kocyigit,
  • Alptekin Temizel,
  • Anup Tuladhar,
  • Lakshay Tyagi,
  • Raissa Souza,
  • Nils D. Forkert,
  • Pauline Mouches,
  • Matthias Wilms,
  • Vishruth Shambhat,
  • Akansh Maurya,
  • Shubham Subhas Danannavar,
  • Rohit Kalla,
  • Vikas Kumar Anand,
  • Ganapathy Krishnamurthi,
  • Sahil Nalawade,
  • Chandan Ganesh,
  • Ben Wagner,
  • Divya Reddy,
  • Yudhajit Das,
  • Fang F. Yu,
  • Baowei Fei,
  • Ananth J. Madhuranthakam,
  • Joseph Maldjian,
  • Gaurav Singh,
  • Jianxun Ren,
  • Wei Zhang,
  • Ning An,
  • Qingyu Hu,
  • Youjia Zhang,
  • Ying Zhou,
  • Vasilis Siomos,
  • Giacomo Tarroni,
  • Jonathan Passerrat-Palmbach,
  • Ambrish Rawat,
  • Giulio Zizzo,
  • Swanand Ravindra Kadhe,
  • Jonathan P. Epperlein,
  • Stefano Braghin,
  • Yuan Wang,
  • Renuga Kanagavelu,
  • Qingsong Wei,
  • Yechao Yang,
  • Yong Liu,
  • Krzysztof Kotowski,
  • Szymon Adamski,
  • Bartosz Machura,
  • Wojciech Malara,
  • Lukasz Zarudzki,
  • Jakub Nalepa,
  • Yaying Shi,
  • Hongjian Gao,
  • Salman Avestimehr,
  • Yonghong Yan,
  • Agus S. Akbar,
  • Ekaterina Kondrateva,
  • Hua Yang,
  • Zhaopei Li,
  • Hung-Yu Wu,
  • Johannes Roth,
  • Camillo Saueressig,
  • Alexandre Milesi,
  • Quoc D. Nguyen,
  • Nathan J. Gruenhagen,
  • Tsung-Ming Huang,
  • Jun Ma,
  • Har Shwinder H. Singh,
  • Nai-Yu Pan,
  • Dingwen Zhang,
  • Ramy A. Zeineldin,
  • Michal Futrega,
  • Yading Yuan,
  • Gian Marco Conte,
  • Xue Feng,
  • Quan D. Pham,
  • Yong Xia,
  • Zhifan Jiang,
  • Huan Minh Luu,
  • Mariia Dobko,
  • Alexandre Carré,
  • Bair Tuchinov,
  • Hassan Mohy-ud-Din,
  • Saruar Alam,
  • Anup Singh,
  • Nameeta Shah,
  • Weichung Wang,
  • Chiharu Sako,
  • Michel Bilello,
  • Satyam Ghodasara,
  • Suyash Mohan,
  • Christos Davatzikos,
  • Evan Calabrese,
  • Jeffrey Rudie,
  • Javier Villanueva-Meyer,
  • Soonmee Cha,
  • Christopher Hess,
  • John Mongan,
  • Madhura Ingalhalikar,
  • Manali Jadhav,
  • Umang Pandey,
  • Jitender Saini,
  • Raymond Y. Huang,
  • Ken Chang,
  • Minh-Son To,
  • Sargam Bhardwaj,
  • Chee Chong,
  • Marc Agzarian,
  • Michal Kozubek,
  • Filip Lux,
  • Jan Michálek,
  • Petr Matula,
  • Miloš Keřkovský,
  • Tereza Kopřivová,
  • Marek Dostál,
  • Václav Vybíhal,
  • Marco C. Pinho,
  • James Holcomb,
  • Marie Metz,
  • Rajan Jain,
  • Matthew D. Lee,
  • Yvonne W. Lui,
  • Pallavi Tiwari,
  • Ruchika Verma,
  • Rohan Bareja,
  • Ipsa Yadav,
  • Jonathan Chen,
  • Neeraj Kumar,
  • Yuriy Gusev,
  • Krithika Bhuvaneshwar,
  • Anousheh Sayah,
  • Camelia Bencheqroun,
  • Anas Belouali,
  • Subha Madhavan,
  • Rivka R. Colen,
  • Aikaterini Kotrotsou,
  • Philipp Vollmuth,
  • Gianluca Brugnara,
  • Chandrakanth J. Preetha,
  • Felix Sahm,
  • Martin Bendszus,
  • Wolfgang Wick,
  • Abhishek Mahajan,
  • Carmen Balaña,
  • Jaume Capellades,
  • Josep Puig,
  • Yoon Seong Choi,
  • Seung-Koo Lee,
  • Jong Hee Chang,
  • Sung Soo Ahn,
  • Hassan F. Shaykh,
  • Alejandro Herrera-Trujillo,
  • Maria Trujillo,
  • William Escobar,
  • Ana Abello,
  • Jose Bernal,
  • Jhon Gómez,
  • Pamela LaMontagne,
  • Daniel S. Marcus,
  • Mikhail Milchenko,
  • Arash Nazeri,
  • Bennett Landman,
  • Karthik Ramadass,
  • Kaiwen Xu,
  • Silky Chotai,
  • Lola B. Chambless,
  • Akshitkumar Mistry,
  • Reid C. Thompson,
  • Ashok Srinivasan,
  • J. Rajiv Bapuraj,
  • Arvind Rao,
  • Nicholas Wang,
  • Ota Yoshiaki,
  • Toshio Moritani,
  • Sevcan Turk,
  • Joonsang Lee,
  • Snehal Prabhudesai,
  • John Garrett,
  • Matthew Larson,
  • Robert Jeraj,
  • Hongwei Li,
  • Tobias Weiss,
  • Michael Weller,
  • Andrea Bink,
  • Bertrand Pouymayou,
  • Sonam Sharma,
  • Tzu-Chi Tseng,
  • Saba Adabi,
  • Alexandre Xavier Falcão,
  • Samuel B. Martins,
  • Bernardo C. A. Teixeira,
  • Flávia Sprenger,
  • David Menotti,
  • Diego R. Lucio,
  • Simone P. Niclou,
  • Olivier Keunen,
  • Ann-Christin Hau,
  • Enrique Pelaez,
  • Heydy Franco-Maldonado,
  • Francis Loayza,
  • Sebastian Quevedo,
  • Richard McKinley,
  • Johannes Slotboom,
  • Piotr Radojewski,
  • Raphael Meier,
  • Roland Wiest,
  • Johannes Trenkler,
  • Josef Pichler,
  • Georg Necker,
  • Andreas Haunschmidt,
  • Stephan Meckel,
  • Pamela Guevara,
  • Esteban Torche,
  • Cristobal Mendoza,
  • Franco Vera,
  • Elvis Ríos,
  • Eduardo López,
  • Sergio A. Velastin,
  • Joseph Choi,
  • Stephen Baek,
  • Yusung Kim,
  • Heba Ismael,
  • Bryan Allen,
  • John M. Buatti,
  • Peter Zampakis,
  • Vasileios Panagiotopoulos,
  • Panagiotis Tsiganos,
  • Sotiris Alexiou,
  • Ilias Haliassos,
  • Evangelia I. Zacharaki,
  • Konstantinos Moustakas,
  • Christina Kalogeropoulou,
  • Dimitrios M. Kardamakis,
  • Bing Luo,
  • Laila M. Poisson,
  • Ning Wen,
  • Martin Vallières,
  • Mahdi Ait Lhaj Loutfi,
  • David Fortin,
  • Martin Lepage,
  • Fanny Morón,
  • Jacob Mandel,
  • Gaurav Shukla,
  • Spencer Liem,
  • Gregory S. Alexandre,
  • Joseph Lombardo,
  • Joshua D. Palmer,
  • Adam E. Flanders,
  • Adam P. Dicker,
  • Godwin Ogbole,
  • Dotun Oyekunle,
  • Olubunmi Odafe-Oyibotha,
  • Babatunde Osobu,
  • Mustapha Shu’aibu Hikima,
  • Mayowa Soneye,
  • Farouk Dako,
  • Adeleye Dorcas,
  • Derrick Murcia,
  • Eric Fu,
  • Rourke Haas,
  • John A. Thompson,
  • David Ryan Ormond,
  • Stuart Currie,
  • Kavi Fatania,
  • Russell Frood,
  • Amber L. Simpson,
  • Jacob J. Peoples,
  • Ricky Hu,
  • Danielle Cutler,
  • Fabio Y. Moraes,
  • Anh Tran,
  • Mohammad Hamghalam,
  • Michael A. Boss,
  • James Gimpel,
  • Deepak Kattil Veettil,
  • Kendall Schmidt,
  • Lisa Cimino,
  • Cynthia Price,
  • Brian Bialecki,
  • Sailaja Marella,
  • Charles Apgar,
  • Andras Jakab,
  • Marc-André Weber,
  • Errol Colak,
  • Jens Kleesiek,
  • John B. Freymann,
  • Justin S. Kirby,
  • Lena Maier-Hein,
  • Jake Albrecht,
  • Peter Mattson,
  • Alexandros Karargyris,
  • Prashant Shah,
  • Bjoern Menze,
  • Klaus Maier-Hein,
  • Spyridon Bakas

DOI
https://doi.org/10.1038/s41467-025-60466-1
Journal volume & issue
Vol. 16, no. 1
pp. 1 – 20

Abstract

Computational competitions are the standard for benchmarking medical image analysis algorithms, but they typically use small curated test datasets acquired at a few centers, leaving a gap to the reality of diverse multicentric patient data. To address this gap, the Federated Tumor Segmentation (FeTS) Challenge represents the paradigm for real-world algorithmic performance evaluation. The FeTS Challenge is a competition to benchmark (i) federated learning aggregation algorithms and (ii) state-of-the-art segmentation algorithms across multiple international sites. Weight aggregation and client selection techniques were compared using a multicentric brain tumor dataset in realistic federated learning simulations, showing benefits of adaptive weight aggregation and efficiency gains through client sampling. Quantitative performance evaluation of state-of-the-art segmentation algorithms on data distributed internationally across 32 institutions yielded good generalization on average, although worst-case performance revealed data-specific modes of failure. Similar multi-site setups can help validate the real-world utility of healthcare AI algorithms in the future.