Research on Obstacle Avoidance Planning for UUV Based on A3C Algorithm

Hongjian Wang; Wei Gao; Zhao Wang; Kai Zhang; Jingfei Ren; Lihui Deng; Shanshan He

doi:10.3390/jmse12010063

Journal of Marine Science and Engineering (Dec 2023)

Affiliations

Hongjian Wang: College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin 150001, China
Wei Gao: College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin 150001, China
Zhao Wang: College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin 150001, China
Kai Zhang: College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin 150001, China
Jingfei Ren: College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin 150001, China
Lihui Deng: College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin 150001, China
Shanshan He: College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin 150001, China

DOI: https://doi.org/10.3390/jmse12010063
Journal volume & issue: Vol. 12, no. 1, p. 63

Abstract

Deep reinforcement learning combines deep learning with reinforcement learning and has been widely applied across many fields. The A3C (Asynchronous Advantage Actor-Critic) algorithm, one such method, makes effective use of computing resources and improves training efficiency by training Actor-Critic agents asynchronously in multiple threads. Motivated by the performance of A3C, this paper applies it to the UUV (Unmanned Underwater Vehicle) collision avoidance planning problem in unknown environments. The resulting planner operates in real time while keeping the path short, and its output actions satisfy the kinematic constraints of the UUV. For this problem, the paper designs the state space, action space, and reward function. Simulation results show that the A3C collision avoidance planning algorithm guides a UUV around obstacles to the preset target point. The planned path satisfies the heading constraints of the UUV, and the planning time is short enough to meet the requirements of real-time planning.
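
As a rough illustration of the "advantage" in Advantage Actor-Critic, the sketch below computes n-step advantage estimates for one short rollout, in the way a single worker thread commonly does before updating the shared networks. It is not taken from the paper: the function name `n_step_advantages`, its parameters, and the discount factor are all assumptions made for illustration, and the paper's actual state space, action space, and reward function are not reproduced here.

```python
from typing import List


def n_step_advantages(
    rewards: List[float],
    values: List[float],
    bootstrap_value: float,
    gamma: float = 0.99,
) -> List[float]:
    """Hypothetical helper: advantage A_t = R_t - V(s_t) for one rollout.

    rewards[t]       reward received after the action taken at step t
    values[t]        critic's estimate V(s_t) for the state at step t
    bootstrap_value  critic's estimate for the state after the last step
                     (0.0 if the episode ended, e.g. on collision or arrival)
    gamma            discount factor (assumed value, not from the paper)
    """
    if len(rewards) != len(values):
        raise ValueError("rewards and values must have the same length")

    advantages = [0.0] * len(rewards)
    discounted_return = bootstrap_value
    # Walk the rollout backwards so each return reuses the one after it.
    for t in reversed(range(len(rewards))):
        discounted_return = rewards[t] + gamma * discounted_return
        advantages[t] = discounted_return - values[t]
    return advantages


if __name__ == "__main__":
    # Two-step rollout that ends the episode, so the bootstrap value is 0.
    print(n_step_advantages([1.0, 1.0], [0.5, 0.5], 0.0, gamma=0.5))
```

In an actor-critic update, a positive advantage raises the probability of the action that was taken and a negative one lowers it, while the critic is trained to move its value estimate toward the discounted return.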

Published in Journal of Marine Science and Engineering

ISSN: 2077-1312 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Naval Science: Naval architecture. Shipbuilding. Marine engineering; Geography. Anthropology. Recreation: Oceanography
Website: http://www.mdpi.com/journal/jmse