IEEE Open Journal of the Communications Society (Jan 2023)

A Continuous Actor–Critic Deep Q-Learning-Enabled Deployment of UAV Base Stations: Toward 6G Small Cells in the Skies of Smart Cities

  • Nahid Parvaresh
  • Burak Kantarci

DOI
https://doi.org/10.1109/OJCOMS.2023.3251297
Journal volume & issue
Vol. 4
pp. 700 – 712

Abstract

Uncrewed aerial vehicle-mounted base stations (UAV-BSs), also known as drone base stations, are considered a promising way to overcome the limitations of ground base stations. They can provide cost-effective Internet connectivity to users who are outside infrastructure coverage, and they can quickly take over as service providers when ground base stations fail unexpectedly. Because UAV-BSs are mobile, they can change their 3D locations when the demand profile changes rapidly. To leverage this mobility effectively and maximize network performance, the 3D locations of UAV-BSs require continuous optimization. However, the UAV-BS location optimization problem is NP-hard, with no known deterministic polynomial-time solution. In this paper, we propose a continuous actor-critic deep Q-learning (ACDQL) solution, a deep reinforcement learning approach, to solve the location optimization problem of UAV-BSs in the presence of mobile endpoints. The simulation results show that the proposed model significantly improves network performance compared to Q-learning, deep Q-learning, and conventional algorithms. While the Q-learning and deep Q-learning-based baselines reach sum data rates of 35 Mbps and 42 Mbps, respectively, the proposed ACDQL-based strategy raises the sum data rate of endpoints to 45 Mbps. Furthermore, the proposed ACDQL-based methodology reduces the convergence time of the UAV-BS placement optimization by 85 percent compared to the Q-learning and deep Q-learning baselines.
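
To put the reported sum data rates in relative terms, the short sketch below computes the percentage gain of the ACDQL result over each baseline. It uses only the three figures quoted in the abstract (35, 42, and 45 Mbps); the helper name `relative_gain_percent` and the dictionary layout are illustrative assumptions, not part of the paper.

```python
# Sum data rates as reported in the abstract, in Mbps.
REPORTED_MBPS = {"Q-learning": 35.0, "deep Q-learning": 42.0, "ACDQL": 45.0}


def relative_gain_percent(proposed: float, baseline: float) -> float:
    """Relative improvement of `proposed` over `baseline`, in percent.

    Hypothetical helper used only for this illustration.
    """
    return 100.0 * (proposed - baseline) / baseline


# Gain of the ACDQL figure over each baseline figure.
gains = {
    name: relative_gain_percent(REPORTED_MBPS["ACDQL"], rate)
    for name, rate in REPORTED_MBPS.items()
    if name != "ACDQL"
}

for name, gain in gains.items():
    print(f"ACDQL vs {name}: +{gain:.1f}%")
```

Under these figures, the gain is roughly 28.6 percent over Q-learning and roughly 7.1 percent over deep Q-learning. The 85 percent reduction in convergence time is a separate result and is not derived from these numbers.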

Keywords