Big Data Mining and Analytics (Dec 2024)
A Multi-Task Based Clustering Personalized Federated Learning Method
Abstract
Federated Learning (FL) is a framework for machine learning on large-scale distributed datasets that enables multiple parties to train a collaborative model while preserving the privacy of user data. However, when data are not independent and identically distributed (non-IID) across clients, both the convergence speed of the federated collaborative model and its prediction accuracy on client nodes can degrade significantly. Personalized FL methods have therefore emerged to adapt to the data characteristics of different clients. To address this data heterogeneity, this paper presents a multi-task clustering-based personalized FL algorithm and applies it to the prediction of carbon emissions in different regions and enterprises. The algorithm groups nodes with similar data distributions into clusters and aggregates the local models within each cluster to form cluster models. Drawing on multi-task learning, it designates the lower layers of each cluster model as expert layers; the expert layers of different cluster models are then weighted and aggregated to acquire global knowledge. In addition, an adaptive weight controls the aggregation of expert layers, thereby achieving model personalization at the local level. Simulation experiments on carbon emission prediction data demonstrate that the proposed algorithm outperforms the Federated Averaging (FedAvg) algorithm and a traditional clustering-based personalized FL algorithm on various evaluation metrics. It also maintains strong performance across different numbers of heterogeneous data distributions.
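The aggregation flow described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names (`mean_vectors`, `aggregate_round`), the flat-list model representation, the use of cluster size as the cross-cluster aggregation weight, and the per-cluster blending weight `alpha` are all assumptions made for illustration. The cluster assignment is taken as given, and how the adaptive weight is actually computed is not specified here.

```python
from collections import defaultdict


def mean_vectors(vectors, weights=None):
    """Weighted element-wise mean of equal-length lists of floats."""
    if weights is None:
        weights = [1.0] * len(vectors)
    total = sum(weights)
    return [
        sum(w * v[i] for w, v in zip(weights, vectors)) / total
        for i in range(len(vectors[0]))
    ]


def aggregate_round(local_models, cluster_of, alpha):
    """One illustrative aggregation round.

    local_models: {client_id: {"expert": [...], "head": [...]}}
                  "expert" stands for the lower (shared) layers,
                  "head" for the remaining cluster-specific layers.
    cluster_of:   {client_id: cluster_id}, assumed to be given.
    alpha:        {cluster_id: weight in [0, 1]} kept for the cluster's
                  own expert layers (hypothetical adaptive weight).
    """
    members = defaultdict(list)
    for client_id, cluster_id in cluster_of.items():
        members[cluster_id].append(local_models[client_id])

    # Step 1: average local models inside each cluster to form cluster models.
    cluster_models = {
        c: {
            "expert": mean_vectors([m["expert"] for m in ms]),
            "head": mean_vectors([m["head"] for m in ms]),
        }
        for c, ms in members.items()
    }

    # Step 2: aggregate only the expert layers across clusters.
    # Weighting by cluster size is an assumption of this sketch.
    ids = list(cluster_models)
    global_expert = mean_vectors(
        [cluster_models[c]["expert"] for c in ids],
        weights=[len(members[c]) for c in ids],
    )

    # Step 3: blend each cluster's expert layers with the global ones.
    personalized = {}
    for c in ids:
        a = alpha[c]
        own = cluster_models[c]["expert"]
        personalized[c] = {
            "expert": [a * o + (1 - a) * g for o, g in zip(own, global_expert)],
            "head": cluster_models[c]["head"],
        }
    return personalized, global_expert
```

With `alpha[c] = 1.0` a cluster ignores the global expert layers entirely, and with `alpha[c] = 0.0` it adopts them unchanged, so the weight sets how much global knowledge each cluster absorbs.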
Keywords