Adaptive control of unmanned surface vehicle based on improved DDPG algorithm

Lifei SONG; Chuanyi XU; Le HAO; Rong GUO; Wei CHAI

doi:10.19693/j.issn.1673-3185.03122

Zhongguo Jianchuan Yanjiu (Feb 2024)

Affiliations

Lifei SONG: Key Laboratory of High Performance Ship Technology of Ministry of Education, Wuhan University of Technology, Wuhan 430063, China
Chuanyi XU: Key Laboratory of High Performance Ship Technology of Ministry of Education, Wuhan University of Technology, Wuhan 430063, China
Le HAO: Key Laboratory of High Performance Ship Technology of Ministry of Education, Wuhan University of Technology, Wuhan 430063, China
Rong GUO: Key Laboratory of High Performance Ship Technology of Ministry of Education, Wuhan University of Technology, Wuhan 430063, China
Wei CHAI: Key Laboratory of High Performance Ship Technology of Ministry of Education, Wuhan University of Technology, Wuhan 430063, China

DOI: https://doi.org/10.19693/j.issn.1673-3185.03122
Journal volume & issue: Vol. 19, no. 1
pp. 137 – 144

Abstract

Objective: To address the poor navigation stability of unmanned surface vehicles (USVs) under interference conditions, an intelligent control parameter adjustment strategy based on deep reinforcement learning (DRL) is proposed.

Method: A dynamic model of a USV is established, combining the line-of-sight (LOS) method with a PID navigation controller to carry out navigation control tasks. Because the PID parameters for course control are time-varying under interference conditions, DRL is introduced: the environmental state, action and reward functions of the agent are designed so that the PID parameters are adjusted online. An improved deep deterministic policy gradient (DDPG) algorithm is proposed to increase the convergence speed and to mitigate convergence to local optima during training. Specifically, the original experience pool is separated into success and failure experience pools, and an adaptive sampling mechanism is designed to optimize the experience replay structure.

Results: The simulation results show that the improved algorithm converges rapidly, with a slightly improved average return in the later stages of training. Under interference conditions, the lateral errors and heading angle deviations of the controller based on the improved DDPG algorithm are significantly reduced. The desired path is reached faster, after which path tracking is maintained more steadily.

Conclusion: The improved algorithm greatly reduces training time, enhances the steady-state performance of the agent in the later stages of training and achieves more accurate path tracking.
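The split of the experience pool into success and failure pools with adaptive sampling can be illustrated with a minimal Python sketch. The abstract does not give the sampling rule, so everything specific here is an assumption made for illustration: the class name `DualReplayBuffer`, the `recent_success_rate` input, and the rule that the share of success samples shrinks as the agent succeeds more often, clipped to `[min_ratio, max_ratio]`. It is not the authors' implementation.

```python
import random
from collections import deque


class DualReplayBuffer:
    """Separate success and failure experience pools (illustrative sketch)."""

    def __init__(self, capacity=10000, min_ratio=0.2, max_ratio=0.8, seed=None):
        # Each pool discards its oldest transitions once capacity is reached.
        self.success = deque(maxlen=capacity)
        self.failure = deque(maxlen=capacity)
        self.min_ratio = min_ratio
        self.max_ratio = max_ratio
        self.rng = random.Random(seed)

    def add(self, transition, succeeded):
        """Store a transition in the pool matching its episode outcome."""
        (self.success if succeeded else self.failure).append(transition)

    def success_ratio(self, recent_success_rate):
        # Assumed rule (not from the paper): replay more success samples
        # while the agent rarely succeeds, fewer once it succeeds often.
        ratio = 1.0 - recent_success_rate
        return min(self.max_ratio, max(self.min_ratio, ratio))

    def sample(self, batch_size, recent_success_rate):
        """Draw a mixed batch; it may be smaller if a pool is short."""
        n_success = round(batch_size * self.success_ratio(recent_success_rate))
        n_success = min(n_success, len(self.success))
        n_failure = min(batch_size - n_success, len(self.failure))
        batch = self.rng.sample(list(self.success), n_success)
        batch += self.rng.sample(list(self.failure), n_failure)
        self.rng.shuffle(batch)
        return batch
```

In a DDPG training loop, a buffer of this kind would replace the single replay buffer: each transition is stored according to whether its episode met the task's success criterion, and each update step draws a mixed batch. Clipping the ratio keeps both pools represented in every batch, which is one plausible way to avoid the policy training only on one kind of experience.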

Published in Zhongguo Jianchuan Yanjiu

ISSN: 1673-3185 (Print)
Publisher: Editorial Office of Chinese Journal of Ship Research
Country of publisher: China
LCC subjects: Naval Science: Naval architecture. Shipbuilding. Marine engineering
Website: http://www.ship-research.com/indexen.htm
