IEEE Open Journal of the Communications Society (Jan 2024)
Deep Reinforcement Learning Approach for HAPS User Scheduling in Massive MIMO Communications
Abstract
In this paper, we devise a deep SARSA reinforcement learning (DSRL) user scheduling algorithm for a base station (BS) that uses a high-altitude platform station (HAPS) as a backup to serve multiple users in a wireless cellular network. To reflect a realistic scenario, we assume that only outdated channel state information (CSI) of the terrestrial base station (TBS) is available in the user scheduling problem. We model this problem as a Markov decision process (MDP), aiming to maximize the sum-rate while minimizing the number of active antennas at the HAPS. Our performance analysis shows that the sum-rate obtained with the proposed DSRL algorithm is close to the optimal sum-rate achieved by exhaustive search. We also develop a heuristic optimization method to solve the user scheduling problem at the BS, and show that when perfect CSI is not available, the proposed DSRL algorithm outperforms this heuristic.
Keywords