IEEE Access (Jan 2024)
Hierarchical Reinforcement Learning Based Resource Allocation for RAN Slicing
Abstract
As the complexity of wireless mobile networks increases significantly, artificial intelligence (AI) and machine learning (ML) have become key enablers for radio resource management and orchestration. In this paper, we propose a multi-agent reinforcement learning (RL) method for allocating radio resources to mobile users under random traffic arrivals, in which Ultra-Reliable Low-Latency Communications (URLLC) and enhanced Mobile Broadband (eMBB) services are jointly considered in the same radio access network (RAN). The proposed system consists of hierarchically placed RL agents, where the main agent at the upper level of the hierarchy performs inter-slice resource allocation between the URLLC and eMBB slices. The URLLC and eMBB sub-agents are responsible for resource allocation within their own slices, with the objective of maximizing the eMBB throughput while satisfying the latency requirements of the URLLC slice. In the RL algorithm, the state space comprises the queue occupancy and channel quality information of the mobile users, while the action space specifies the resource allocation to the users. To make RL training computationally efficient, the state space is significantly reduced by quantizing the queue occupancy and grouping the users according to their channel qualities. The numerical results for URLLC show that the proposed RL-based approach achieves average delays below 1 ms across all experiments, while the worst-case eMBB throughput degradation is limited to 4%.
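The state-space reduction described in the abstract can be sketched as follows. This is a minimal illustrative example, not the paper's implementation: the bin edges, number of channel-quality groups, and function names are all assumptions chosen for clarity.

```python
# Illustrative sketch of the state-space reduction: queue occupancy is
# quantized into a few discrete levels and users are grouped by channel
# quality (CQI), so the RL state grows with the number of groups rather
# than the number of users. All thresholds below are assumed values.

def quantize_queue(occupancy, edges=(100, 1_000, 10_000)):
    """Map raw queue occupancy (e.g. in bytes) to a discrete level 0..len(edges)."""
    for level, edge in enumerate(edges):
        if occupancy < edge:
            return level
    return len(edges)

def group_by_cqi(user_cqis, num_groups=3, cqi_max=15):
    """Count users per channel-quality group instead of tracking each user."""
    counts = [0] * num_groups
    width = cqi_max / num_groups
    for cqi in user_cqis:
        g = min(int(cqi / width), num_groups - 1)
        counts[g] += 1
    return tuple(counts)

def build_slice_state(queue_occupancies, user_cqis):
    """Compact per-slice RL state: quantized queue levels + CQI group counts."""
    return tuple(quantize_queue(q) for q in queue_occupancies) + group_by_cqi(user_cqis)
```

With, say, 4 queue levels and 3 CQI groups, the state description stays small even as the user population grows, which is what keeps the tabular or small-network RL training tractable.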
Keywords