Multi-Objective Reinforcement Learning for Power Allocation in Massive MIMO Networks: A Solution to Spectral and Energy Trade-Offs

Youngwoo Oh; Arif Ullah; Wooyeol Choi

doi:10.1109/ACCESS.2023.3347788

IEEE Access (Jan 2024)

Affiliations

Youngwoo Oh: Department of Computer Engineering, College of IT Convergence, Chosun University, Gwangju, Republic of Korea
Arif Ullah: Department of Computer Engineering, College of IT Convergence, Chosun University, Gwangju, Republic of Korea
Wooyeol Choi: Department of Computer Engineering, College of IT Convergence, Chosun University, Gwangju, Republic of Korea

Journal volume & issue: Vol. 12, pp. 1172–1188

Abstract

The joint optimization of spectral efficiency (SE) and energy efficiency (EE) through power allocation (PA) is a critical requirement for fifth-generation and beyond networks. Balancing SE against EE is especially challenging in multi-cell cellular networks whose base stations (BSs) are equipped with massive multiple-input multiple-output (MIMO). Various algorithmic approaches, including genetic algorithms and convex optimization, have been considered for optimizing this trade-off, but they suffer from high computational costs. Deep reinforcement learning is a promising technique for addressing the computational challenges of single-objective optimization problems in wireless networks, and multi-objective reinforcement learning, which has been employed for multi-objective optimization problems, can be used to jointly enhance SE and EE in cellular networks. In this paper, we propose a downlink (DL) transmit PA method based on a multi-objective asynchronous advantage single-actor, multiple-critics (MO-A3Cs) architecture, which aims to optimize the SE-EE trade-off in massive MIMO-assisted multi-cell networks. We also propose a Bayesian rule-based preference weight updating mechanism, a multi-objective advantage function, and a balanced-reward aggregation method to train the model effectively and to avoid rewards biased toward a single objective during training. Extensive simulations show that the proposed model handles the joint optimization of SE and EE better in dynamically changing scenarios. Compared with existing benchmarks such as Pareto front approximation-based multi-objective, reinforcement learning-based single-objective, and iterative methods, the proposed approach provides a better SE-EE trade-off by achieving a higher EE in multi-cell massive MIMO networks.

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
