AoI-Oriented Resource Allocation for NOMA-Based Wireless Powered Cognitive Radio Networks Based on Multi-Agent Deep Reinforcement Learning

Tao He; Yingsheng Peng; Yong Liu; Hui Song

doi:10.1109/ACCESS.2024.3401624

IEEE Access (Jan 2024)

Affiliations

Tao He: School of Electronics and Information Engineering, South China Normal University, Foshan, China
Yingsheng Peng: School of Electronics and Communication Engineering, Sun Yat-sen University, Shenzhen, China
Yong Liu: School of Electronics and Information Engineering, South China Normal University, Foshan, China
Hui Song: School of Electronics and Information Engineering, South China Normal University, Foshan, China

DOI: https://doi.org/10.1109/ACCESS.2024.3401624
Journal volume & issue: Vol. 12, pp. 69738–69752

Abstract

In this paper, we study a wireless powered cognitive Internet of Things (IoT) network in which cognitive radio (CR) and non-orthogonal multiple access (NOMA) are exploited to improve spectral efficiency, and radio frequency energy harvesting (RF-EH) is integrated to make the network sustainable. To ensure fresh information delivery, we adopt the age of information (AoI) as the performance metric and formulate a long-term average AoI minimization problem under an energy sustainability constraint, in which the working mode and transmit power of the secondary devices (SDs) are jointly optimized. We then reformulate the problem as a decentralized Markov decision process (Dec-MDP) with a continuous action space. Within a deep reinforcement learning (DRL) framework, we propose a multi-agent twin delayed deep deterministic policy gradient algorithm with a dual action selection mechanism (MATD3-DAS), which adopts the centralized training and decentralized execution (CTDE) framework and uses both the actor and critic networks to select actions, improving exploration. Simulation results show that the proposed algorithm significantly reduces the long-term average AoI, with reductions approaching 9.58% and 52.34% compared with the MATD3 algorithm and the TD3-DAS algorithm with centralized training and centralized execution (CTCE), respectively.
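
As a rough illustration of the AoI metric named in the abstract, the sketch below computes a time-averaged AoI over a slotted horizon. It assumes a common textbook convention that is not taken from the paper: the age resets to 1 in a slot where a fresh update is delivered and grows by 1 otherwise. The function name `average_aoi`, the `initial_age` parameter, and the delivery pattern are hypothetical.

```python
def average_aoi(delivered, initial_age=1):
    """Time-averaged age of information over a slotted horizon.

    delivered[t] is True when a fresh update is received in slot t.
    Assumed rule (not from the paper): the age resets to 1 on a
    delivery and otherwise increases by 1 per slot.
    """
    if not delivered:
        raise ValueError("delivered must contain at least one slot")
    age = initial_age
    total = 0
    for ok in delivered:
        age = 1 if ok else age + 1
        total += age
    return total / len(delivered)


# Hypothetical pattern: updates are delivered in slots 1 and 4 (0-indexed).
pattern = [False, True, False, False, True, False]
print(average_aoi(pattern))
```

Under this convention, fewer delivered updates mean longer runs of increasing age and a higher average, which is the quantity a long-term average AoI minimization would aim to drive down.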

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
