Actor-critic learning-based energy optimization for UAV access and backhaul networks

Yaxiong Yuan; Lei Lei; Thang X. Vu; Symeon Chatzinotas; Sumei Sun; Björn Ottersten

doi:10.1186/s13638-021-01960-0

EURASIP Journal on Wireless Communications and Networking (Apr 2021)

Affiliations

Yaxiong Yuan: Interdisciplinary Center for Security, Reliability and Trust, University of Luxembourg
Lei Lei: Interdisciplinary Center for Security, Reliability and Trust, University of Luxembourg
Thang X. Vu: Interdisciplinary Center for Security, Reliability and Trust, University of Luxembourg
Symeon Chatzinotas: Interdisciplinary Center for Security, Reliability and Trust, University of Luxembourg
Sumei Sun: Institute for Infocomm Research, Agency for Science, Technology, and Research
Björn Ottersten: Interdisciplinary Center for Security, Reliability and Trust, University of Luxembourg

DOI: https://doi.org/10.1186/s13638-021-01960-0
Journal volume & issue: Vol. 2021, no. 1
pp. 1 – 27

Abstract

In unmanned aerial vehicle (UAV)-assisted networks, a UAV acts as an aerial base station that acquires the requested data via a backhaul link and then serves ground users (GUs) through an access network. In this paper, we investigate an energy minimization problem with a limited power supply for both backhaul and access links. The difficulty in solving such a non-convex, combinatorial problem lies in its high computational complexity and long computation time. To develop solutions, we consider approaches from both actor-critic deep reinforcement learning (AC-DRL) and optimization perspectives. First, two offline non-learning algorithms, an optimal algorithm and a heuristic algorithm, based on piecewise linear approximation and relaxation, are developed as benchmarks. Second, toward real-time decision-making, we improve the conventional AC-DRL and propose two learning schemes: AC-based user group scheduling and backhaul power allocation (ACGP), and joint AC-based user group scheduling and optimization-based backhaul power allocation (ACGOP). Numerical results show that the computation time of both ACGP and ACGOP is reduced tenfold to hundredfold compared to the offline approaches, and that ACGOP achieves greater energy savings than ACGP. The results also verify that the proposed learning solutions outperform the conventional AC-DRL in guaranteeing feasibility and minimizing system energy.

Published in EURASIP Journal on Wireless Communications and Networking

ISSN: 1687-1472 (Print); 1687-1499 (Online)
Publisher: SpringerOpen
Country of publisher: United Kingdom
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering: Telecommunication; Technology: Electrical engineering. Electronics. Nuclear engineering: Electronics
Website: https://jwcn-eurasipjournals.springeropen.com
