Deep Reinforcement Learning-Assisted Optimization for Resource Allocation in Downlink OFDMA Cooperative Systems

Mulugeta Kassaw Tefera; Shengbing Zhang; Zengwang Jin

doi:10.3390/e25030413

Entropy (Feb 2023)

Affiliations

Mulugeta Kassaw Tefera: School of Cybersecurity, Northwestern Polytechnical University, Xi’an 710072, China
Shengbing Zhang: School of Cybersecurity, Northwestern Polytechnical University, Xi’an 710072, China
Zengwang Jin: School of Cybersecurity, Northwestern Polytechnical University, Xi’an 710072, China

DOI: https://doi.org/10.3390/e25030413
Journal volume & issue: Vol. 25, no. 3, p. 413

Abstract

This paper considers a downlink resource-allocation problem in distributed interference orthogonal frequency-division multiple access (OFDMA) systems under maximum power constraints. As the upcoming fifth-generation (5G) wireless networks become increasingly complex and heterogeneous, it is challenging for resource allocation to optimize the system performance metrics while also guaranteeing user service requests. Because the underlying optimization problems are non-convex, finding the optimal resource allocation with existing approaches is computationally expensive. Recently, model-free reinforcement learning (RL) techniques have emerged as alternative approaches for solving non-convex and NP-hard optimization problems in wireless networks. In this paper, we study a deep Q-learning (DQL)-based approach to optimizing transmit power control for users in multi-cell interference networks. In particular, we apply a DQL algorithm for resource allocation to maximize the overall system throughput subject to maximum power and signal-to-interference-plus-noise ratio (SINR) constraints in a flat frequency channel. We first formulate the optimization problem as a non-cooperative game model, in which multiple base stations (BSs) compete for spectral efficiency by improving their achievable utility functions while ensuring the quality of service (QoS) requirements of the corresponding receivers. Then, we develop a deep reinforcement learning (DRL)-based resource allocation model to maximize the system throughput while satisfying the power and spectral efficiency requirements. In this setting, we define the state-action spaces and the reward function to explore the possible actions and learning outcomes. Numerical simulations demonstrate that the proposed DQL-based scheme outperforms the traditional model-based solution.
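To make the objective described in the abstract concrete, the following is a minimal sketch of how the system throughput and the power and SINR constraints could be evaluated for one candidate power allocation. It is not the authors' implementation. It assumes a simplified setting of one user per BS on a single flat subchannel, and the names `gains`, `powers`, `noise`, `p_max`, and `sinr_min` are hypothetical, introduced only for illustration.

```python
import math


def sinr(gains, powers, noise):
    """Per-link SINR. gains[i][j] is the (assumed) channel power gain
    from BS j to the user served by BS i; powers[j] is BS j's transmit power."""
    n = len(powers)
    values = []
    for i in range(n):
        signal = gains[i][i] * powers[i]
        # Interference from every other BS transmitting on the same subchannel.
        interference = sum(gains[i][j] * powers[j] for j in range(n) if j != i)
        values.append(signal / (interference + noise))
    return values


def sum_rate(gains, powers, noise):
    """Sum spectral efficiency in bit/s/Hz (Shannon rate per link)."""
    return sum(math.log2(1.0 + s) for s in sinr(gains, powers, noise))


def is_feasible(gains, powers, noise, p_max, sinr_min):
    """Check the maximum-power constraint and a minimum-SINR (QoS) constraint."""
    power_ok = all(0.0 <= p <= p_max for p in powers)
    qos_ok = all(s >= sinr_min for s in sinr(gains, powers, noise))
    return power_ok and qos_ok


# Illustrative two-cell example with made-up gains and noise power.
gains = [[1.0, 0.1],
         [0.2, 1.0]]
powers = [1.0, 1.0]
noise = 0.1

print(sinr(gains, powers, noise))
print(sum_rate(gains, powers, noise))
print(is_feasible(gains, powers, noise, p_max=1.0, sinr_min=3.0))
```

In a DQL formulation of the kind the abstract outlines, a quantity such as `sum_rate` could serve as the basis of the reward and `is_feasible` as a constraint check on the chosen action. The exact state, action, and reward definitions are given in the paper and are not reproduced here.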

Published in Entropy

ISSN: 1099-4300 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science: Astronomy: Astrophysics; Science: Physics
Website: http://www.mdpi.com/journal/entropy
