IEEE Open Journal of the Communications Society (Jan 2023)

Dynamic Data Sample Selection and Scheduling in Edge Federated Learning

  • Mohamed Adel Serhani,
  • Haftay Gebreslasie Abreha,
  • Asadullah Tariq,
  • Mohammad Hayajneh,
  • Yang Xu,
  • Kadhim Hayawi

DOI
https://doi.org/10.1109/OJCOMS.2023.3313257
Journal volume & issue
Vol. 4
pp. 2133 – 2149

Abstract

Federated Learning (FL) is a state-of-the-art paradigm used in Edge Computing (EC). It enables distributed training on cross-device data, achieving efficient performance while preserving data privacy. In the era of Big Data, the Internet of Things (IoT), and data streaming, challenges such as monitoring and management remain unresolved. Edge IoT devices produce and stream huge volumes of data samples, and using all of these samples during local updates can incur significant processing, computation, and storage costs. Many research initiatives have improved FL algorithms for homogeneous networks. However, in a typical distributed learning scenario, each device generates its data independently, and this heterogeneous data has different distribution characteristics. As a result, the data stream that each device uses for local learning, often characterized as Big Data, is unbalanced and not independent and identically distributed (non-IID). Such data heterogeneity can degrade the performance of FL and reduce resource utilization. In this paper, we present DSS-Edge-FL, a Dynamic Sample Selection optimization algorithm that aims to optimize resource usage and address data heterogeneity. Extensive experimental results demonstrate that the proposed approach outperforms conventional training methods, with lower convergence time and improved resource efficiency.

Keywords