Alexandria Engineering Journal (Nov 2024)

An optimal resource assignment and mode selection for vehicular communication using proximal on-policy scheme

  • Ishan Budhiraja,
  • Anna Alphy,
  • Pawan Pandey,
  • Sahil Garg,
  • Bong Jun Choi,
  • Mohammad Mehedi Hassan

Journal volume & issue
Vol. 107
pp. 268 – 279

Abstract

Vehicle-to-everything (V2X) communication is essential in 5G and upcoming networks because it enables seamless interaction between vehicles and infrastructure and ensures the reliable transmission of critical, time-sensitive data. Vehicle-to-vehicle (V2V) communication is hindered by unstable links in highly mobile vehicular networks, limited channel state information, high transmission overhead, and significant communication costs. To tackle these issues, a unified approach based on distributed deep reinforcement learning is proposed to enhance overall network performance while meeting quality of service (QoS), latency, and rate requirements. Because the underlying problem is NP-hard and non-convex, it is modeled as a Markov decision process (MDP), which supports the formulation of a reward function and the selection of optimal actions. Furthermore, a spectrum-based allocation framework employing multi-agent deep reinforcement learning (MADRL) is introduced. Within this framework, the deep deterministic policy gradient (DDPG) enables the global exchange of historical data during the primary learning phase, removing the need for signal interaction and manual intervention when optimizing system efficiency. The data transmission policy follows an augmented online policy scheme, the proximal online policy scheme (POPS), which reduces computational complexity during the learning process. The complexity is further adjusted marginally in the learning phase using the clipping substitute technique. Simulation results show that the proposed method outperforms existing decentralized systems, achieving a higher average data transmission rate and better QoS satisfaction.
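
The abstract does not define the "clipping substitute technique". One plausible reading, offered here only as an assumption, is that it corresponds to the clipped surrogate objective used in proximal policy optimization (PPO). Under that assumption, the minimal Python sketch below shows how clipping bounds the effect of a single policy update. The function name, the argument names, and the default `epsilon=0.2` are illustrative choices and do not come from the paper.

```python
def clipped_surrogate(ratios, advantages, epsilon=0.2):
    """Mean PPO-style clipped surrogate objective over a batch.

    ratios     : new_policy_prob / old_policy_prob for each sampled action
    advantages : advantage estimate for each sampled action
    epsilon    : clipping range (hypothetical default, not from the paper)
    """
    if len(ratios) != len(advantages) or not ratios:
        raise ValueError("ratios and advantages must be non-empty and equal in length")

    total = 0.0
    for ratio, advantage in zip(ratios, advantages):
        # Restrict the probability ratio to [1 - epsilon, 1 + epsilon].
        clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
        # Take the more pessimistic of the unclipped and clipped terms,
        # so a large ratio cannot inflate the objective.
        total += min(ratio * advantage, clipped_ratio * advantage)
    return total / len(ratios)
```

With a positive advantage, a ratio of 1.5 contributes as if it were 1.2, so the update gains nothing from moving the policy further than the clipping range allows. With a negative advantage, the unclipped term is kept whenever it is the smaller one, so a harmful move is still penalized in full.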

Keywords