An optimal resource assignment and mode selection for vehicular communication using proximal on-policy scheme

Ishan Budhiraja; Anna Alphy; Pawan Pandey; Sahil Garg; Bong Jun Choi; Mohammad Mehedi Hassan

Alexandria Engineering Journal (Nov 2024)

Affiliations

Ishan Budhiraja: School of Computer Science Engineering & Technology, Bennett University, Greater Noida, Uttar Pradesh, India
Anna Alphy: Department of Computer Science and Engineering, SRM IST, NCR Campus, Ghaziabad, India
Pawan Pandey: Department of Computer Science and Engineering, Raj Kumar Goel Institute of Technology, Ghaziabad, India
Sahil Garg: Electrical Engineering Department, École de technologie supérieure, Montréal, Canada
Bong Jun Choi: School of Computer Science and Engineering, Soongsil University, Seoul, Republic of Korea; Corresponding author.
Mohammad Mehedi Hassan: Department of Information Systems, College of Computer and Information Sciences, King Saud University, Riyadh 11543, Saudi Arabia

Journal volume & issue: Vol. 107
pp. 268 – 279

Abstract

Vehicle-to-everything (V2X) communication is essential in 5G and upcoming networks because it enables seamless interaction between vehicles and infrastructure and ensures the reliable transmission of critical, time-sensitive data. Vehicle-to-vehicle (V2V) communication, however, is hindered by unstable links in highly mobile vehicular networks, limited channel state information, high transmission overhead, and significant communication costs. To tackle these issues, a unified approach based on distributed deep reinforcement learning is proposed to enhance overall network performance while meeting quality of service (QoS), latency, and rate requirements. Because the problem is NP-hard and non-convex, a machine learning framework based on the Markov decision process (MDP) is adopted, which allows a reward function to be formulated and optimal actions to be selected. Furthermore, a spectrum-based allocation framework employing multi-agent deep reinforcement learning (MADRL) is introduced. The deep deterministic policy gradient (DDPG) within this framework enables the global exchange of historical data during the primary learning phase, removing the need for signal interaction and manual intervention in optimizing system efficiency. The data transmission policy follows an augmented online policy scheme, known as the proximal online policy scheme (POPS), which reduces the computational complexity of the learning process. The complexity is marginally adjusted in the learning phase using the clipping substitute technique. Simulation results show that the proposed method outperforms existing decentralized systems, achieving a higher average data transmission rate while ensuring QoS satisfaction.
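
The "clipping substitute technique" named in the abstract reads like the clipped surrogate objective used in proximal policy optimization (PPO). That mapping is an assumption, since the abstract gives no formula. Under that assumption, a minimal sketch of the objective is shown below. The function name `clipped_surrogate`, its parameters, and the default `epsilon = 0.2` are illustrative choices, not taken from the paper.

```python
def clipped_surrogate(ratios, advantages, epsilon=0.2):
    """Mean PPO-style clipped surrogate objective over a batch.

    ratios:     new-policy / old-policy probability ratios, one per sample
    advantages: advantage estimates for the same samples
    epsilon:    clip range (illustrative default, not from the paper)
    """
    if len(ratios) != len(advantages) or not ratios:
        raise ValueError("ratios and advantages must be non-empty and equal in length")

    total = 0.0
    for ratio, advantage in zip(ratios, advantages):
        # Restrict the ratio to the interval [1 - epsilon, 1 + epsilon].
        clipped_ratio = min(max(ratio, 1.0 - epsilon), 1.0 + epsilon)
        # Take the more pessimistic of the unclipped and clipped terms,
        # so a large policy change cannot inflate the objective.
        total += min(ratio * advantage, clipped_ratio * advantage)
    return total / len(ratios)
```

For example, with a ratio of 1.5 and an advantage of 1.0, the term is capped at 1.2 rather than 1.5, which limits how far a single update can move the policy.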

Published in Alexandria Engineering Journal

ISSN: 1110-0168 (Print); 2090-2670 (Online)
Publisher: Elsevier
Country of publisher: Egypt
LCC subjects: Technology: Engineering (General). Civil engineering (General)
Website: http://www.journals.elsevier.com/alexandria-engineering-journal/
