Generalized Federated Learning via Gradient Norm-Aware Minimization and Control Variables

Yicheng Xu; Wubin Ma; Chaofan Dai; Yahui Wu; Haohao Zhou

doi:10.3390/math12172644

Mathematics (Aug 2024)


Affiliations

Yicheng Xu: College of Systems Engineering, National University of Defense Technology, Changsha 410073, China
Wubin Ma: College of Systems Engineering, National University of Defense Technology, Changsha 410073, China
Chaofan Dai: College of Systems Engineering, National University of Defense Technology, Changsha 410073, China
Yahui Wu: College of Systems Engineering, National University of Defense Technology, Changsha 410073, China
Haohao Zhou: College of Systems Engineering, National University of Defense Technology, Changsha 410073, China

DOI: https://doi.org/10.3390/math12172644
Journal volume & issue: Vol. 12, no. 17, p. 2644

Abstract

Federated Learning (FL) is a promising distributed machine learning framework that emphasizes privacy protection. However, local optimization objectives can become inconsistent with the global objective, a problem commonly referred to as client drift, which arises primarily from non-independent and identically distributed (Non-IID) data, multiple local training steps, and partial client participation in training. Most current research tackling this challenge is based on the empirical risk minimization (ERM) principle and gives little consideration to the connection between the global loss landscape and generalization capability. This study proposes FedGAM, an FL algorithm that incorporates Gradient Norm-Aware Minimization (GAM) to efficiently search for a locally flat landscape. Specifically, FedGAM modifies the client model training objective to simultaneously minimize the loss value and first-order flatness, thereby seeking flat minima. To smooth the global flatness directly, we further propose FedGAM-CV, which employs control variables to correct local updates, guiding each client to train its model in a globally flat direction. Experiments on three datasets (CIFAR-10, MNIST, and FashionMNIST) demonstrate that the proposed algorithms outperform existing FL baselines, effectively finding flat minima and addressing the client drift problem.
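
The abstract does not say what form the control variables take. As a rough illustration of how control variables can correct local updates, the sketch below assumes a SCAFFOLD-style correction, in which each local gradient has the client's own control variable subtracted and the global one added. The toy quadratic client losses, the function name `local_update`, and all parameter values are hypothetical. The first-order flatness term of GAM is omitted entirely, so this is not the FedGAM-CV algorithm itself.

```python
import numpy as np

def local_update(w_global, grad_fn, c_local, c_global, lr=0.1, steps=5, use_cv=True):
    """Run a few local gradient steps, optionally corrected by control variables."""
    w = w_global.copy()
    for _ in range(steps):
        g = grad_fn(w)
        if use_cv:
            # Assumed SCAFFOLD-style correction: remove the client's own
            # drift direction (c_local) and add the global one (c_global).
            g = g - c_local + c_global
        w = w - lr * g
    return w

# Hypothetical Non-IID setup: client i has loss 0.5 * ||w - a_i||^2,
# so the two clients pull the model in opposite directions.
targets = [np.array([1.0, 0.0]), np.array([-1.0, 0.0])]
grad_fns = [lambda w, a=a: w - a for a in targets]

w_global = np.zeros(2)  # minimizer of the average of the two client losses

# Control variables: each client's gradient at the global model, and their mean.
c_locals = [g(w_global) for g in grad_fns]
c_global = np.mean(c_locals, axis=0)

# Client 0 trains locally with and without the correction.
plain = local_update(w_global, grad_fns[0], c_locals[0], c_global, use_cv=False)
corrected = local_update(w_global, grad_fns[0], c_locals[0], c_global, use_cv=True)
```

In this toy setting the uncorrected update drifts toward client 0's own minimizer, while the corrected update stays at the minimizer of the average loss. This is the kind of client drift the abstract describes.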

Published in Mathematics

ISSN: 2227-7390 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science: Mathematics
Website: http://www.mdpi.com/journal/mathematics
