IEEE Access (Jan 2021)
A Q-Learning-Based Resource Allocation for Downlink Non-Orthogonal Multiple Access Systems Considering QoS
Abstract
As a technology that can accommodate more users and significantly improve spectral efficiency, non-orthogonal multiple access (NOMA) has attracted considerable research attention in recent years. The basic idea of NOMA is to implement multiple access in the power domain and to decode the desired signal via successive interference cancellation (SIC). However, the resource allocation problem in such NOMA systems is non-convex and is therefore difficult to solve directly with conventional optimization methods. As such, we propose a reinforcement learning (RL) approach based on cooperative Q-learning to solve the resource allocation problem in multi-antenna downlink NOMA systems. First, we formulate the resource allocation process as a sum rate maximization problem subject to power budget constraints and a quality of service (QoS) condition. Second, we design a reward function that improves the sum rate while meeting the power and capacity constraints. Multiple Q-tables are created and cooperatively updated to obtain the optimal beamforming matrix. Then, we analyze the convergence of the proposed RL-based power allocation method. Our simulations show that the proposed power allocation scheme yields excellent performance in terms of sum rate, energy efficiency, and spectral efficiency.
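To make the idea of a QoS-aware reward concrete, the following is a minimal sketch of tabular Q-learning on a toy problem. It is not the authors' cooperative multi-table algorithm and it does not model beamforming: it assumes a single-antenna, two-user downlink, a discretized power split `alpha`, a single Q-table, and illustrative channel gains, power budget, and QoS threshold. All function names and parameter values are hypothetical.

```python
import math
import random

MOVES = (-1, 0, 1)  # lower, keep, or raise the weak user's power share

def noma_rates(alpha, p_total, g_strong, g_weak, noise=1.0):
    """Rates (bit/s/Hz) of a two-user power-domain NOMA downlink.

    alpha is the fraction of the power budget given to the weak user.
    The weak user treats the strong user's signal as interference;
    the strong user removes the weak user's signal via SIC.
    """
    p_weak = alpha * p_total
    p_strong = (1.0 - alpha) * p_total
    r_weak = math.log2(1.0 + p_weak * g_weak / (p_strong * g_weak + noise))
    r_strong = math.log2(1.0 + p_strong * g_strong / noise)
    return r_strong, r_weak

def reward(alpha, p_total, g_strong, g_weak, r_min):
    """Sum rate if both users meet the QoS rate r_min, else a penalty."""
    r_strong, r_weak = noma_rates(alpha, p_total, g_strong, g_weak)
    if min(r_strong, r_weak) < r_min:
        return -1.0  # assumed penalty value for a QoS violation
    return r_strong + r_weak

def train(levels, p_total=10.0, g_strong=2.0, g_weak=0.5, r_min=0.8,
          episodes=3000, steps=30, lr=0.5, gamma=0.5, eps=0.3, seed=0):
    """Epsilon-greedy tabular Q-learning over the power-split levels.

    State: index of the current level. Action: one entry of MOVES.
    The power budget holds by construction, since alpha splits p_total.
    """
    rng = random.Random(seed)
    last = len(levels) - 1
    q = [[0.0] * len(MOVES) for _ in levels]
    for _ in range(episodes):
        s = rng.randrange(len(levels))  # random start state
        for _ in range(steps):
            if rng.random() < eps:
                a = rng.randrange(len(MOVES))  # explore
            else:
                a = max(range(len(MOVES)), key=lambda k: q[s][k])  # exploit
            s_next = min(max(s + MOVES[a], 0), last)
            r = reward(levels[s_next], p_total, g_strong, g_weak, r_min)
            # Standard Q-learning update toward r + gamma * max_a' Q(s', a').
            q[s][a] += lr * (r + gamma * max(q[s_next]) - q[s][a])
            s = s_next
    return q

def greedy_rollout(q, levels, start, steps=20):
    """Follow the learned greedy policy and return the final power split."""
    s = start
    for _ in range(steps):
        a = max(range(len(MOVES)), key=lambda k: q[s][k])
        s = min(max(s + MOVES[a], 0), len(levels) - 1)
    return levels[s]

if __name__ == "__main__":
    levels = [i / 10 for i in range(1, 10)]  # alpha in {0.1, ..., 0.9}
    q_table = train(levels)
    print("learned power share for the weak user:",
          greedy_rollout(q_table, levels, start=0))
```

In this toy setting the sum rate grows as more power goes to the strong user, so the best feasible split is the smallest `alpha` that still meets the weak user's rate requirement. The penalty term is what keeps the learned policy away from splits that violate the QoS condition.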
Keywords