Dynamic Obstacle Avoidance for USVs Using Cross-Domain Deep Reinforcement Learning and Neural Network Model Predictive Controller

Jianwen Li; Jalil Chavez-Galaviz; Kamyar Azizzadenesheli; Nina Mahmoudian

doi:10.3390/s23073572

Sensors (Mar 2023)

Affiliations

Jianwen Li: The School of Mechanical Engineering, Purdue University, West Lafayette, IN 47907, USA
Jalil Chavez-Galaviz: The School of Mechanical Engineering, Purdue University, West Lafayette, IN 47907, USA
Kamyar Azizzadenesheli: Nvidia Corporation, Santa Clara, CA 95051, USA
Nina Mahmoudian: The School of Mechanical Engineering, Purdue University, West Lafayette, IN 47907, USA

DOI: https://doi.org/10.3390/s23073572
Journal volume & issue: Vol. 23, no. 7, p. 3572

Abstract

This work presents a framework that allows Unmanned Surface Vehicles (USVs) to avoid dynamic obstacles through initial training on an Unmanned Ground Vehicle (UGV) followed by cross-domain retraining on a USV. The framework integrates a Deep Reinforcement Learning (DRL) agent, which generates high-level control commands, with a neural-network-based model predictive controller (NN-MPC), which reaches target waypoints and rejects disturbances. The Deep Q Network (DQN) used in this framework is trained in a ground environment using a Turtlebot robot and then retrained in a water environment using the BREAM USV in the Gazebo simulator to avoid dynamic obstacles. The network is then validated in both simulation and real-world tests. Compared with training purely in the water domain, cross-domain learning reduces training time by 28% and improves obstacle avoidance performance by 70 reward points. This methodology shows that it is possible to leverage data-rich, accessible ground environments to train DRL agents for data-poor, difficult-to-access marine environments. This will allow rapid and iterative agent development without further training when the environment or vehicle dynamics change.
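The cross-domain step described in the abstract can be illustrated with a minimal sketch, under the assumption that retraining starts from the weights learned in the ground domain and that both domains share the same observation and action spaces. The abstract does not state these details, so the `QNetwork` class, the layer sizes, and the `load_weights_from` helper below are hypothetical and are not the authors' implementation.

```python
import copy
import random


class QNetwork:
    """Tiny two-layer MLP: observation -> one Q-value per discrete action."""

    def __init__(self, obs_dim, n_actions, hidden=16, seed=0):
        rng = random.Random(seed)
        # Weights are plain nested lists to keep the sketch dependency-free.
        self.w1 = [[rng.gauss(0.0, 0.1) for _ in range(hidden)] for _ in range(obs_dim)]
        self.b1 = [0.0] * hidden
        self.w2 = [[rng.gauss(0.0, 0.1) for _ in range(n_actions)] for _ in range(hidden)]
        self.b2 = [0.0] * n_actions

    def q_values(self, obs):
        # Hidden layer with ReLU activation.
        hidden = [
            max(0.0, sum(o * row[j] for o, row in zip(obs, self.w1)) + self.b1[j])
            for j in range(len(self.b1))
        ]
        # Linear output layer: one Q-value per action.
        return [
            sum(h * row[k] for h, row in zip(hidden, self.w2)) + self.b2[k]
            for k in range(len(self.b2))
        ]

    def greedy_action(self, obs):
        # Index of the action with the highest estimated Q-value.
        q = self.q_values(obs)
        return max(range(len(q)), key=q.__getitem__)

    def load_weights_from(self, other):
        # Deep copies so later updates do not alter the source network.
        self.w1, self.b1 = copy.deepcopy(other.w1), copy.deepcopy(other.b1)
        self.w2, self.b2 = copy.deepcopy(other.w2), copy.deepcopy(other.b2)


# Hypothetical sizes; the abstract does not give the real dimensions.
OBS_DIM, N_ACTIONS = 8, 5

ground_net = QNetwork(OBS_DIM, N_ACTIONS, seed=1)
# (ground-domain DQN training would update ground_net here)

water_net = QNetwork(OBS_DIM, N_ACTIONS, seed=2)
water_net.load_weights_from(ground_net)
# (water-domain retraining would continue from these weights)
```

In this sketch the water-domain network begins with the same action preferences as the ground-domain network, which is one plausible reason a warm start could shorten training relative to a random initialization.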

Published in Sensors

ISSN: 1424-8220 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Chemical technology
Website: http://www.mdpi.com/journal/sensors
