IEEE Access (Jan 2024)

Training Is Execution: A Reinforcement Learning-Based Collision Avoidance Algorithm for Volatile Scenarios

  • Jian Ban
  • Gongyan Li

DOI
https://doi.org/10.1109/ACCESS.2024.3448292
Journal volume & issue
Vol. 12
pp. 116956–116967

Abstract

Collision avoidance is one of the most fundamental and challenging aspects of socially aware robot navigation. Recently, numerous approaches based on reinforcement learning (RL) have been proposed for collision avoidance in experimental settings. However, RL approaches face two challenges: lengthy training processes and poor generalization in volatile environments. These issues arise because an RL agent must be retrained whenever it encounters a new environment or an environmental change. In this work, we present an algorithm that executes without prior training and adaptively optimizes itself when the environment changes: its training is its execution. The novelty of this work is twofold: (1) a fluctuating epsilon model, designed from scratch, which adjusts the exploration probability according to the algorithm's current performance; and (2) a velocity obstacle-based exploration model, which combines RL with the Velocity Obstacles (VO) method by replacing RL's random exploration with a classic collision avoidance algorithm, Optimal Reciprocal Collision Avoidance (ORCA). Our proposed algorithm is evaluated in simulation scenarios. Experimental results demonstrate that the full model outperforms state-of-the-art (SOTA) baselines, achieving a 3.23% increase in success rate. Moreover, the model continues to improve after the environment changes; following an environmental change, its success rate exceeds the baseline's by 17.49%.
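As a rough illustration of how these two ideas might fit together, the Python sketch below wires a performance-driven exploration rate into an epsilon-greedy action selector that consults an ORCA solver instead of sampling random actions. Everything here is an assumption for exposition: the class name, the threshold-based epsilon update, and the orca_action callable are hypothetical stand-ins, not the authors' implementation.

    import random

    class FluctuatingEpsilonAgent:
        """Illustrative sketch only: a performance-driven ("fluctuating")
        epsilon combined with ORCA-guided exploration. The update rule,
        thresholds, and interfaces are assumptions, not the paper's code."""

        def __init__(self, eps=0.5, eps_min=0.05, eps_max=0.95, step=0.05):
            self.eps = eps          # current exploration probability
            self.eps_min = eps_min  # floor: exploration never fully stops
            self.eps_max = eps_max  # ceiling: the agent never acts purely by ORCA
            self.step = step        # how strongly performance moves epsilon

        def update_epsilon(self, recent_success_rate):
            # Fluctuating epsilon (assumed rule): explore more when recent
            # performance is poor, exploit the learned policy when it is good.
            if recent_success_rate < 0.5:
                self.eps = min(self.eps_max, self.eps + self.step)
            else:
                self.eps = max(self.eps_min, self.eps - self.step)

        def select_action(self, state, q_values, orca_action):
            # VO-based exploration (assumed wiring): with probability eps,
            # defer to an ORCA/VO solver rather than taking a uniformly
            # random action; otherwise act greedily on the learned Q-values.
            if random.random() < self.eps:
                return orca_action(state)  # hypothetical ORCA solver callable
            return max(q_values, key=q_values.get)

The design point the abstract highlights is that exploration itself is safe and informative: even before any learning, the agent's exploratory moves come from a proven collision avoidance method rather than from noise, which is what allows training to coincide with execution.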

Keywords