Reinforcement-Learning-Based Edge Offloading Orchestration in Computing Continuum

Ioana Ramona Martin; Gabriel Ioan Arcas; Tudor Cioara

doi:10.3390/computers13110295

Computers (Nov 2024)

Reinforcement-Learning-Based Edge Offloading Orchestration in Computing Continuum

Ioana Ramona Martin,
Gabriel Ioan Arcas,
Tudor Cioara

Affiliations

Ioana Ramona Martin: Computer Science Department, Technical University of Cluj-Napoca, Memorandumului 28, 400114 Cluj-Napoca, Romania
Gabriel Ioan Arcas: Bosch Engineering Center, 400158 Cluj-Napoca, Romania
Tudor Cioara: Computer Science Department, Technical University of Cluj-Napoca, Memorandumului 28, 400114 Cluj-Napoca, Romania

DOI: https://doi.org/10.3390/computers13110295
Journal volume & issue: Vol. 13, no. 11
p. 295

Abstract

Read online

The AI-driven applications and large data generated by IoT devices connected to large-scale utility infrastructures pose significant operational challenges, including increased latency, communication overhead, and computational imbalances. Addressing these is essential to shift the workloads from the cloud to the edge and across the entire computing continuum. However, to achieve this, significant challenges must still be addressed, particularly in decision making to manage the trade-offs associated with workload offloading. In this paper, we propose a task-offloading solution using Reinforcement Learning (RL) to dynamically balance workloads and reduce overloads. We have chosen the Deep Q-Learning algorithm and adapted it to our workload offloading problem. The reward system considers the node’s computational state and type to increase the utilization of the computational resources while minimizing latency and bandwidth utilization. A knowledge graph model of the computing continuum infrastructure is used to address environment modeling challenges and facilitate RL. The learning agent’s performance was evaluated using different hyperparameter configurations and varying episode lengths or knowledge graph model sizes. Results show that for a better learning experience, a low, steady learning rate and a large buffer size are important. Additionally, it offers strong convergence features, with relevant workload tasks and node pairs identified after each learning episode. It also demonstrates good scalability, as the number of offloading pairs and actions increases with the size of the knowledge graph and the episode count.

Published in Computers

ISSN: 2073-431X (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: http://www.mdpi.com/journal/computers

About the journal

Abstract

Keywords