An Empirical Study of DDPG and PPO-Based Reinforcement Learning Algorithms for Autonomous Driving

Sanjna Siboo; Anushka Bhattacharyya; Rashmi Naveen Raj; S. H. Ashwin

doi:10.1109/ACCESS.2023.3330665

IEEE Access (Jan 2023)

Affiliations

Sanjna Siboo: Department of Information and Communication Technology, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka, India
Anushka Bhattacharyya: Department of Information and Communication Technology, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka, India
Rashmi Naveen Raj: Department of Information and Communication Technology, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka, India
S. H. Ashwin: Standard Chartered Global Business Services, Bengaluru, India

DOI: https://doi.org/10.1109/ACCESS.2023.3330665
Journal volume & issue: Vol. 11
pp. 125094 – 125108

Abstract

Autonomous vehicles can mitigate road accidents and provide safe transportation with smooth traffic flow. They are expected to greatly improve the quality of life of the elderly and people with impairments by making transportation easier to access and thereby improving their mobility. Autonomous vehicles sense the driving environment and navigate through it without human intervention. Deep Reinforcement Learning (DRL) has emerged as a powerful machine learning approach to the sequential decision-making problem in autonomous vehicles. The state of the art in autonomous vehicles using DRL algorithms is discussed in detail, along with future research directions. Because the action space is high-dimensional, two continuous-action-space DRL algorithms, Deep Deterministic Policy Gradient (DDPG) and Proximal Policy Optimization (PPO), are chosen to address the complex autonomous driving problem. The proposed DDPG- and PPO-based decision-making models are trained and tested using the TORCS simulator. Both algorithms are trained for the same number of episodes in lane-keeping as well as multi-agent collision-avoidance scenarios. To the best of our knowledge, this is the first paper to present a comparative performance analysis of these two algorithms; DDPG is found to perform better than PPO, achieving higher reward and faster convergence. Hence, DDPG is a suitable option for the design of a decision model for autonomous driving.

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
