IEEE Access (Jan 2024)
FedOps: A Platform of Federated Learning Operations With Heterogeneity Management
Abstract
Federated learning (FL) is a decentralized machine learning (ML) method that enables model training while preserving privacy. FL is gaining attention because it avoids transferring data to the server, allowing traditional ML models to be trained in a decentralized manner. Despite this potential, FL projects are significantly more challenging to develop than centralized ML projects because the local data are decentralized. We propose FedOps (federated learning operations), which extends machine learning operations (MLOps) so that it can be applied effectively to FL while preserving its core process, enabling FL projects to be constructed systematically. To address the complexity of FL implementation, we developed the FedOps Platform, which supports FedOps-based projects in managing the whole lifecycle in an FL context. We also investigated methods to identify the factors that degrade performance in FL and suggest an approach for improvement. The FedOps Platform provides an analysis tool for client heterogeneity, called chunk-bench. This tool enables researchers and engineers to gain insights into systems heterogeneity by using only a small chunk of the clients’ data to run tests in the shortest time possible while tracking systems heterogeneity across the clients. By addressing systems heterogeneity, the FedOps Platform achieved a 13%–43% improvement in communication cost-to-accuracy and a 20%–68% improvement in time-to-accuracy. We believe that the FedOps Platform offers an optimal solution for the end-to-end development of FL projects while significantly improving both computational and communication efficiency.
Keywords