IEEE Access (Jan 2023)

Low-Complexity Q-Learning for Energy-Aware Small-Cell Networks With Integrated Access and Backhaul

  • Junseung Lee,
  • Hyun-Ho Choi,
  • Seung-Chan Lim,
  • Hyungsub Kim,
  • Jeehyeon Na,
  • Howon Lee

DOI
https://doi.org/10.1109/ACCESS.2023.3328957
Journal volume & issue
Vol. 11
pp. 121529 – 121538

Abstract

An integrated access and backhaul (IAB)-enabled small-cell network shares frequency channels between access and backhaul links, which gives it the opportunity to use those channels efficiently. However, applying IAB technology to practical small-cell networks still raises several problems, such as the extremely high computational complexity caused by shared resource utilization and the additional co-tier and cross-tier interference management it requires. We therefore propose a multi-agent distributed Q-learning with pre-resource partitioning (MADQ-PRP) algorithm to address frequency channel allocation and energy consumption. In MADQ-PRP, to reduce computational complexity, each reinforcement learning (RL) agent considers only its local state information when determining its next action. Nevertheless, by sharing and redistributing rewards among agents, the overall reward can still be maximized. To reduce the computational complexity of MADQ-PRP further, we devise a pre-resource partitioning (PRP) method that depends on the number of small-cell base stations (SBSs) per macro base station (MBS) and on the numbers of MBS and SBS channels. Through intensive simulations, we show that the proposed MADQ-PRP algorithm converges to the optimal solution obtained by exhaustive search. We also demonstrate that it outperforms several benchmark algorithms, such as ‘Random action,’ ‘SBS on-off,’ ‘SBS-only,’ and ‘MADQ-only,’ in IAB-enabled small-cell networks with non-uniform traffic distribution. In addition, the proposed MADQ-PRP algorithm reduces CPU execution time by 9.1% and 97.9% compared with the distributed and centralized RL algorithms, respectively. Combining low-complexity RL with PRP, the proposed algorithm could be one way for network operators to optimize heterogeneous network performance when considering the coverage-capacity tradeoff.
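
To make the core idea in the abstract concrete, the following is a minimal sketch of distributed tabular Q-learning in which each agent sees only its own local state and all agents learn from a shared, redistributed reward. It is not the authors' MADQ-PRP implementation: the abstract does not specify the state, action, or reward definitions, so the toy traffic model, the reward function, the equal-split sharing rule, and every name and parameter below (`local_rewards`, `share_rewards`, `train`, `energy_cost`, and so on) are assumptions made for illustration. Pre-resource partitioning is not modeled.

```python
import random


def local_rewards(actions, loads, energy_cost=0.25):
    """Toy per-agent reward (an assumption, not the paper's reward).

    Action 0 means the agent sleeps; action k > 0 means it transmits on
    channel k. An active agent serves its load only if no other agent
    picked the same channel, and always pays a fixed energy cost.
    """
    rewards = []
    for i, action in enumerate(actions):
        if action == 0:
            rewards.append(0.0)
            continue
        collided = any(j != i and other == action
                       for j, other in enumerate(actions))
        served = 0.0 if collided else float(loads[i])
        rewards.append(served - energy_cost)
    return rewards


def share_rewards(rewards):
    """Redistribute rewards equally so every agent optimizes the team mean."""
    mean = sum(rewards) / len(rewards)
    return [mean] * len(rewards)


def train(n_agents=3, n_channels=2, steps=5000,
          alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Run independent Q-learners, one per agent, on the toy model."""
    rng = random.Random(seed)
    n_actions = n_channels + 1   # sleep + one action per channel
    n_states = 2                 # local state: own load is 0 or 1
    # q[agent][local_state][action]: each table grows with the agent's
    # own state space only, not with the joint state of all agents.
    q = [[[0.0] * n_actions for _ in range(n_states)]
         for _ in range(n_agents)]
    loads = [rng.randint(0, 1) for _ in range(n_agents)]

    for _ in range(steps):
        # Epsilon-greedy action selection from local state only.
        actions = []
        for i in range(n_agents):
            if rng.random() < epsilon:
                actions.append(rng.randrange(n_actions))
            else:
                row = q[i][loads[i]]
                actions.append(row.index(max(row)))

        shared = share_rewards(local_rewards(actions, loads))
        next_loads = [rng.randint(0, 1) for _ in range(n_agents)]

        # Standard Q-learning update, applied per agent.
        for i in range(n_agents):
            s, a, s_next = loads[i], actions[i], next_loads[i]
            target = shared[i] + gamma * max(q[i][s_next])
            q[i][s][a] += alpha * (target - q[i][s][a])
        loads = next_loads
    return q
```

In this sketch each agent stores a table of size `n_states * n_actions`, whereas a single centralized learner over the joint state and joint action would need a table that grows exponentially with the number of agents. That difference is the general motivation for distributed learning with local states; the execution-time figures quoted in the abstract come from the authors' own simulations and are not reproduced by this toy model.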

Keywords