Simultaneous Control and Guidance of an AUV Based on Soft Actor–Critic

Yoann Sola; Gilles Le Chenadec; Benoit Clement

doi:10.3390/s22166072

Sensors (Aug 2022)


Affiliations

Yoann Sola: Lab-STICC UMR CNRS 6285, ENSTA Bretagne, 29200 Brest, France
Gilles Le Chenadec: Lab-STICC UMR CNRS 6285, ENSTA Bretagne, 29200 Brest, France
Benoit Clement: Lab-STICC UMR CNRS 6285, ENSTA Bretagne, 29200 Brest, France

DOI: https://doi.org/10.3390/s22166072
Journal volume & issue: Vol. 22, no. 16, p. 6072

Abstract

The marine environment is a hostile setting for robotics. It is strongly unstructured and uncertain, and it includes many external disturbances that cannot be easily predicted or modeled. In this work, we attempt to control an autonomous underwater vehicle (AUV) performing a waypoint tracking task, using a machine learning-based controller. Machine learning has progressed greatly in recent years across many domains; in the subfield of deep reinforcement learning, several algorithms suitable for the continuous control of dynamical systems have been designed. We implemented the soft actor–critic (SAC) algorithm, an entropy-regularized deep reinforcement learning algorithm that learns the task while simultaneously encouraging exploration of the environment. We compared a SAC-based controller with a proportional integral derivative (PID) controller on a waypoint tracking task using specific performance metrics. All tests were run in simulation with the UUV Simulator. We applied both controllers to the RexROV 2, a cube-shaped, six-degrees-of-freedom remotely operated underwater vehicle (ROV) converted into an AUV. These tests yield several contributions: the SAC controller performs control and guidance of the AUV simultaneously, it outperforms the PID controller in terms of energy saving, and the amount of information needed as input by the SAC algorithm is reduced. Moreover, our implementation of this controller facilitates transfer to real-world robots. The code corresponding to this work is available on GitHub.
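
As a rough illustration of what "entropy-regularized" means for SAC, the sketch below computes a discounted return in which each step's reward is augmented by an entropy bonus, estimated from a single trajectory as -alpha * log pi(a_t | s_t). It is not taken from the authors' code: the function name `soft_return` and the default values of `alpha` (temperature) and `gamma` (discount factor) are assumptions chosen only for the example.

```python
import math


def soft_return(rewards, log_probs, alpha=0.2, gamma=0.99):
    """Discounted return with an entropy bonus, estimated from one trajectory.

    rewards[t]   : task reward received at step t
    log_probs[t] : log pi(a_t | s_t) of the action actually taken at step t
    alpha        : temperature weighting the entropy bonus (assumed value)
    gamma        : discount factor (assumed value)
    """
    if len(rewards) != len(log_probs):
        raise ValueError("rewards and log_probs must have the same length")

    total = 0.0
    for t, (reward, log_prob) in enumerate(zip(rewards, log_probs)):
        # -log_prob is a one-sample estimate of the policy entropy at s_t,
        # so less predictable actions earn a larger bonus.
        total += (gamma ** t) * (reward - alpha * log_prob)
    return total


if __name__ == "__main__":
    # Same task reward, but the second policy is more random (lower action probability).
    confident = soft_return([1.0], [math.log(0.9)], alpha=1.0, gamma=1.0)
    exploratory = soft_return([1.0], [math.log(0.25)], alpha=1.0, gamma=1.0)
    print(confident < exploratory)
```

With `alpha` set to zero the function reduces to the ordinary discounted return, which is one way to see that the temperature trades off task reward against exploration.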

Published in Sensors

ISSN: 1424-8220 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Chemical technology
Website: http://www.mdpi.com/journal/sensors
