Machine Learning with Applications (Dec 2023)

Comparing deep reinforcement learning architectures for autonomous racing

  • Benjamin David Evans,
  • Hendrik Willem Jordaan,
  • Herman Arnold Engelbrecht

Journal volume & issue
Vol. 14, article 100496

Abstract

In classical autonomous racing, a perception, planning, and control pipeline is employed to navigate vehicles around a track as quickly as possible. In contrast, neural network controllers have been used to replace part of, or the entire, pipeline. This paper compares three deep learning architectures for F1Tenth autonomous racing: full planning, which replaces the global and local planner; trajectory tracking, which replaces the local planner; and end-to-end, which replaces the entire pipeline. The evaluation contrasts two reward signals, compares the DDPG, TD3, and SAC algorithms, and investigates how well the learned policies generalise to different test maps. Training the agents in simulation shows that the full planning agent has the most robust training and testing performance. The trajectory tracking agents achieve fast lap times on the training map but low completion rates on different test maps. Transferring the trained agents to a physical F1Tenth car reveals that the trajectory tracking and full planning agents transfer poorly, displaying rapid side-to-side swerving (slaloming). In contrast, the end-to-end agent, the worst performer in simulation, transfers best to the physical vehicle and completes the test track at a maximum speed of 5 m/s. These results show that planning-based methods outperform end-to-end approaches in simulation, but end-to-end approaches transfer better to physical robots.
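To make the architectural distinction concrete, the sketch below shows the interface of an end-to-end agent: a raw LiDAR scan is mapped directly to a continuous (steering, speed) action, with no intermediate planner. This is only an illustrative stand-in, not the paper's implementation — the beam count, steering limit, and the single linear layer standing in for a trained DDPG/TD3/SAC actor are all assumptions; only the 5 m/s top speed comes from the abstract.

```python
import numpy as np

# Hypothetical dimensions -- the paper's exact network sizes are not stated here.
N_BEAMS = 20      # downsampled LiDAR beams fed to the end-to-end agent (assumed)
MAX_STEER = 0.4   # rad, a typical F1Tenth steering limit (assumed)
MAX_SPEED = 5.0   # m/s, the top speed reported for the physical tests

rng = np.random.default_rng(0)
W = rng.standard_normal((2, N_BEAMS)) * 0.1  # stand-in for trained actor weights


def end_to_end_action(scan: np.ndarray) -> tuple[float, float]:
    """Map a raw LiDAR scan directly to a (steering, speed) command.

    A trained continuous-control actor (e.g. the TD3 or SAC policy network)
    would replace the single linear layer below; tanh squashes the outputs
    to [-1, 1] before scaling to the vehicle's actuation ranges, as is
    common for continuous-action actors.
    """
    a = np.tanh(W @ scan)
    steering = float(a[0]) * MAX_STEER
    speed = (float(a[1]) + 1.0) / 2.0 * MAX_SPEED  # map [-1, 1] -> [0, MAX_SPEED]
    return steering, speed


steer, speed = end_to_end_action(np.ones(N_BEAMS))
assert abs(steer) <= MAX_STEER and 0.0 <= speed <= MAX_SPEED
```

By contrast, the trajectory tracking and full planning agents would consume processed inputs (e.g. a reference trajectory or track centreline features) rather than the raw scan; only the observation fed to the policy changes, not the action interface.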

Keywords