National Science Open (Jan 2023)

DeceFL: a principled fully decentralized federated learning framework

  • Yuan Ye,
  • Liu Jun,
  • Jin Dou,
  • Yue Zuogong,
  • Yang Tao,
  • Chen Ruijuan,
  • Wang Maolin,
  • Xu Lei,
  • Hua Feng,
  • Guo Yuqi,
  • Tang Xiuchuan,
  • He Xin,
  • Yi Xinlei,
  • Li Dong,
  • Yu Wenwu,
  • Zhang Hai-Tao,
  • Chai Tianyou,
  • Sui Shaochun,
  • Ding Han

DOI
https://doi.org/10.1360/nso/20220043
Journal volume & issue
Vol. 2

Abstract


Traditional machine learning relies on a centralized data pipeline for model training in various applications; however, data are inherently fragmented. This decentralized nature of databases presents a serious challenge for collaboration: sending all decentralized datasets to a central server raises serious privacy concerns. Although there has been a joint effort to tackle this critical issue by proposing privacy-preserving machine learning frameworks, such as federated learning, most state-of-the-art frameworks are still built in a centralized way, in which a central client is needed to collect model information (instead of the data itself) from every other client and distribute it back, leading to a high communication burden and high vulnerability to a failure of, or an attack on, the central client. Here we propose a principled decentralized federated learning algorithm (DeceFL), which does not require a central client and relies only on local information transmission between clients and their neighbors, representing a fully decentralized learning framework. It is further proven that every client reaches the global minimum with zero performance gap and achieves the same convergence rate $O(1/T)$ (where $T$ is the number of iterations in gradient descent) as centralized federated learning when the loss function is smooth and strongly convex. Finally, the proposed algorithm is applied to a number of applications to illustrate its effectiveness for both convex and nonconvex loss functions, time-invariant and time-varying topologies, and IID and non-IID datasets, demonstrating its applicability to a wide range of real-world medical and industrial applications.
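
The abstract does not state the update rule, so the following is only a minimal sketch of the general idea it describes: each client mixes its model with those of its neighbors and then takes a local gradient step, with no central client involved. It assumes a standard decentralized gradient descent step with a diminishing step size, a fixed ring topology, and scalar quadratic (smooth, strongly convex) local losses $f_i(w) = \tfrac{1}{2} a_i (w - c_i)^2$. The names `ring_weights`, `decentralized_gd`, and `global_minimizer`, and all numeric values, are hypothetical and are not taken from the paper.

```python
def ring_weights(n):
    """Doubly stochastic mixing matrix for a ring of n >= 3 clients.

    Assumption for illustration: every client weights itself and its two
    ring neighbors equally (1/3 each).
    """
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in (i - 1, i, i + 1):
            W[i][j % n] = 1.0 / 3.0
    return W


def decentralized_gd(a, c, steps=5000):
    """Run decentralized gradient descent on f_i(w) = 0.5 * a_i * (w - c_i)^2.

    Each client only uses its own gradient and its neighbors' current models.
    Returns the list of per-client models after `steps` iterations.
    """
    n = len(a)
    W = ring_weights(n)
    w = [0.0] * n  # every client starts from the same initial model
    for t in range(steps):
        eta = 1.0 / (t + 10)  # diminishing step size (assumed schedule)
        # Neighbor averaging: client i combines models it can receive locally.
        mixed = [sum(W[i][j] * w[j] for j in range(n)) for i in range(n)]
        # Local gradient step, evaluated at the client's own current model.
        w = [mixed[i] - eta * a[i] * (w[i] - c[i]) for i in range(n)]
    return w


def global_minimizer(a, c):
    """Closed-form minimizer of the summed loss sum_i f_i(w)."""
    return sum(ai * ci for ai, ci in zip(a, c)) / sum(a)


if __name__ == "__main__":
    a = [0.5, 1.0, 1.5, 2.0, 1.0]    # hypothetical local curvatures
    c = [-2.0, 0.0, 1.0, 3.0, 4.0]   # hypothetical local optima (non-IID data)
    print("global minimizer:", global_minimizer(a, c))
    print("client models:   ", decentralized_gd(a, c))
```

For these values the global minimizer is 1.75, and in this toy setting every client's model approaches it even though each local optimum `c[i]` is different, which mirrors the zero-performance-gap claim for the smooth, strongly convex case. The sketch does not cover the time-varying topologies or nonconvex losses mentioned in the abstract.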

Keywords