Scientific Reports (Sep 2024)

Federated influencer learning for secure and efficient collaborative learning in realistic medical database environment

  • Haengbok Chung,
  • Jae Sung Lee

DOI
https://doi.org/10.1038/s41598-024-73863-1
Journal volume & issue
Vol. 14, no. 1
pp. 1 – 12

Abstract

Read online

Abstract Enhancing deep learning performance requires extensive datasets. Centralized training raises concerns about data ownership and security. Additionally, large models are often unsuitable for hospitals due to their limited resource capacities. Federated learning (FL) has been introduced to address these issues. However, FL faces challenges such as vulnerability to attacks, non-IID data, reliance on a central server, high communication overhead, and suboptimal model aggregation. Furthermore, FL is not optimized for realistic hospital database environments, where data are dynamically accumulated. To overcome these limitations, we propose federated influencer learning (FIL) as a secure and efficient collaborative learning paradigm. Unlike the server-client model of FL, FIL features an equal-status structure among participants, with an administrator overseeing the overall process. FIL comprises four stages: local training, qualification, screening, and influencing. Local training is similar to vanilla FL, except for the optional use of a shared dataset. In the qualification stage, participants are classified as influencers or followers. During the screening stage, the integrity of the logits from the influencer is examined. If the integrity is confirmed, the influencer shares their knowledge with the others. FIL is more secure than FL because it eliminates the need for model-parameter transactions, central servers, and generative models. Additionally, FIL supports model-agnostic training. These features make FIL particularly promising for fields such as healthcare, where maintaining confidentiality is crucial. Our experiments demonstrated the effectiveness of FIL, which outperformed several FL methods on large medical (X-ray, MRI, and PET) and natural (CIFAR-10) image dataset in a dynamically accumulating database environment, with consistently higher precision, recall, Dice score, and lower standard deviation between participants. In particular, in the PET dataset, FIL achieved about a 40% improvement in Dice score and recall.

Keywords