A Power Allocation Scheme for MIMO-NOMA and D2D Vehicular Edge Computing Based on Decentralized DRL

Dunxing Long; Qiong Wu; Qiang Fan; Pingyi Fan; Zhengquan Li; Jing Fan

doi:10.3390/s23073449

Sensors (Mar 2023)

Affiliations

Dunxing Long: School of Internet of Things Engineering, Jiangnan University, Wuxi 214122, China
Qiong Wu: School of Internet of Things Engineering, Jiangnan University, Wuxi 214122, China
Qiang Fan: Qualcomm, San Jose, CA 95110, USA
Pingyi Fan: Department of Electronic Engineering, Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing 100084, China
Zhengquan Li: School of Internet of Things Engineering, Jiangnan University, Wuxi 214122, China
Jing Fan: University Key Laboratory of Information and Communication on Security Backup and Recovery in Yunnan Province, Yunnan Minzu University, Kunming 650500, China

DOI: https://doi.org/10.3390/s23073449
Journal volume & issue: Vol. 23, no. 7, p. 3449

Abstract

In vehicular edge computing (VEC), a task can be processed locally, offloaded to the mobile edge computing (MEC) server at a base station (BS), or offloaded to a nearby vehicle. Whether a task is offloaded depends on the status of vehicle-to-infrastructure (V2I) and vehicle-to-vehicle (V2V) communication. This paper considers device-to-device (D2D)-based V2V communication and V2I communication based on multiple-input multiple-output and nonorthogonal multiple access (MIMO-NOMA). In practical scenarios, the channel conditions for MIMO-NOMA-based V2I communication are uncertain and task arrival is random, which makes the VEC environment highly complex. To address this, we propose a power allocation scheme based on decentralized deep reinforcement learning (DRL). Because the action space is continuous, we employ the deep deterministic policy gradient (DDPG) algorithm to learn the optimal policy. Extensive experiments demonstrate that the proposed DDPG-based approach outperforms existing greedy strategies in terms of power consumption and reward.
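To illustrate why a continuous action space suits a deterministic policy such as the one DDPG learns, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical three-feature state (buffer length, V2I channel gain, V2V channel gain), a single linear layer with hand-picked weights in place of a trained actor network, and a hypothetical per-link power budget `P_MAX`. The critic, replay buffer, and training loop are omitted.

```python
import math
import random

P_MAX = 1.0  # hypothetical per-link power budget in watts (assumption)

def actor(state, weights, bias):
    """Deterministic policy: linear layer + tanh, rescaled to [0, P_MAX] per link."""
    powers = []
    for w_row, b in zip(weights, bias):
        z = sum(w * s for w, s in zip(w_row, state)) + b
        powers.append(0.5 * (math.tanh(z) + 1.0) * P_MAX)
    return powers

def explore(powers, sigma, rng):
    """Add Gaussian exploration noise, then clip back into [0, P_MAX]."""
    return [min(P_MAX, max(0.0, p + rng.gauss(0.0, sigma))) for p in powers]

# Hypothetical state: [buffer length, V2I channel gain, V2V channel gain]
state = [0.6, 0.3, 0.8]
# Hand-picked illustrative weights; a real actor would be a trained network.
weights = [[0.5, 1.0, 0.0],   # row 0 -> V2I (MIMO-NOMA) transmit power
           [0.5, 0.0, 1.0]]   # row 1 -> V2V (D2D) transmit power
bias = [0.0, 0.0]

rng = random.Random(0)
action = actor(state, weights, bias)            # continuous power levels
noisy_action = explore(action, sigma=0.1, rng=rng)  # action used during training
```

The actor outputs one real-valued power level per link rather than choosing from a discrete set, which is the property that motivates a policy-gradient method for continuous control here.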

Published in Sensors

ISSN: 1424-8220 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Chemical technology
Website: http://www.mdpi.com/journal/sensors
