Complex & Intelligent Systems (Nov 2022)

Federated learning based on stratified sampling and regularization

  • Chenyang Lu,
  • Wubin Ma,
  • Rui Wang,
  • Su Deng,
  • Yahui Wu

DOI
https://doi.org/10.1007/s40747-022-00895-3
Journal volume & issue
Vol. 9, no. 2
pp. 2081–2099

Abstract


Federated learning (FL) is a distributed learning framework that differs from traditional distributed machine learning in three respects: (1) devices differ in communication, computing, and storage performance (device heterogeneity); (2) clients differ in data distribution and data volume (data heterogeneity); and (3) communication consumption is high. Under heterogeneous conditions, client data distributions vary greatly, which slows the convergence of the training model and prevents it from reaching the global optimal solution. In this work, an FL algorithm based on stratified sampling and regularization (FedSSAR) is proposed. In FedSSAR, a density-based clustering method divides the overall client population into clusters, and available clients are then drawn proportionally from the different clusters to participate in training, which realizes unbiased sampling over the client population and reduces the variance of client aggregation weights. At the same time, when computing the local loss function, a regularization term constrains the direction of model updates so that heterogeneous clients are optimized toward the global optimum. We prove the convergence of FedSSAR theoretically and experimentally, and demonstrate its superiority by comparing it with other FL algorithms on public datasets.
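The two mechanisms the abstract describes can be sketched in a few lines. This is a minimal illustration, not the paper's exact method: the cluster assignments are assumed given (the paper uses density-based clustering), the helper names are hypothetical, and the quadratic proximal penalty is one common form of the regularization term that keeps local updates near the global model.

```python
import random

def stratified_sample(clusters, num_selected):
    """Draw clients proportionally from each cluster (stratum).

    clusters: dict mapping cluster id -> list of client ids
              (assumed precomputed by a clustering step)
    num_selected: total number of clients to sample this round
    """
    total = sum(len(members) for members in clusters.values())
    selected = []
    for members in clusters.values():
        # Each stratum contributes in proportion to its size,
        # so the sample is unbiased over the client population.
        k = max(1, round(num_selected * len(members) / total))
        selected.extend(random.sample(members, min(k, len(members))))
    return selected

def regularized_local_loss(local_loss, local_weights, global_weights, mu):
    """Local objective plus a proximal-style penalty (illustrative).

    The penalty 0.5 * mu * ||w_local - w_global||^2 discourages local
    updates from drifting away from the global model direction.
    """
    penalty = 0.5 * mu * sum(
        (w - g) ** 2 for w, g in zip(local_weights, global_weights)
    )
    return local_loss + penalty

# Example: 20 clients in three clusters, 8 sampled per round.
random.seed(0)
clusters = {0: list(range(10)), 1: list(range(10, 15)), 2: list(range(15, 20))}
chosen = stratified_sample(clusters, 8)   # 4 + 2 + 2 clients
loss = regularized_local_loss(1.0, [1.0, 2.0], [0.0, 0.0], mu=0.1)  # 1.25
```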

Keywords