Double Deep Q-Network-Based Energy-Efficient Resource Allocation in Cloud Radio Access Network

Amjad Iqbal; Mau-Luen Tham; Yoong Choon Chang

doi:10.1109/ACCESS.2021.3054909

IEEE Access (Jan 2021)

Double Deep Q-Network-Based Energy-Efficient Resource Allocation in Cloud Radio Access Network

Amjad Iqbal,
Mau-Luen Tham,
Yoong Choon Chang

Affiliations

Amjad Iqbal: ORCiD; Department of Electrical and Electronic Engineering, Lee Kong Chian Faculty of Engineering and Science, Universiti Tunku Abdul Rahman, Sungai Long Campus, Selangor, Malaysia
Mau-Luen Tham: Department of Electrical and Electronic Engineering, Lee Kong Chian Faculty of Engineering and Science, Universiti Tunku Abdul Rahman, Sungai Long Campus, Selangor, Malaysia
Yoong Choon Chang: Department of Electrical and Electronic Engineering, Lee Kong Chian Faculty of Engineering and Science, Universiti Tunku Abdul Rahman, Sungai Long Campus, Selangor, Malaysia

DOI: https://doi.org/10.1109/ACCESS.2021.3054909
Journal volume & issue: Vol. 9
pp. 20440 – 20449

Abstract

Read online

Cloud radio access network (CRAN) has been shown as an effective means to boost network performance. Such gain stems from the intelligent management of remote radio heads (RRHs) in terms of on/off operation mode and power consumption. Most conventional resource allocation (RA) methods, however, optimize the network utility without considering the switching overhead of RRHs in adjacent time intervals. When the network environment becomes time-correlated, mathematical optimization is not directly applicable. In this paper, we aim to optimize the energy efficiency (EE) subject to the constraints on per-RRH transmission power and user data rates. To this end, we formulate the EE problem as a Markov decision process (MDP) and subsequently adopt deep reinforcement learning (DRL) technique to reap the cumulative EE rewards. Our starting point is the deep Q network (DQN), which is a combination of deep learning and Q-learning. In each time slot, DQN configures the status of RRHs yielding the largest Q-value (known as state-action value) prior to solving a power minimization problem for active RRHs. To overcome the Q-value overestimation issue of DQN, we propose a Double DQN (DDQN) framework that obtains optimal reward better than DQN by separating the selected action from the target Q-value generator. Simulation results validate that the DDQN-based RA method is more energy-efficient than the DQN-based RA algorithm and a baseline solution.

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639

About the journal

Abstract

Keywords