Improved Double Deep Q-Network Algorithm Applied to Multi-Dimensional Environment Path Planning of Hexapod Robots

Liuhongxu Chen; Qibiao Wang; Chao Deng; Bo Xie; Xianguo Tuo; Gang Jiang

doi:10.3390/s24072061

Sensors (Mar 2024)

Affiliations

Liuhongxu Chen: School of Computer Science and Engineering, Sichuan University of Science and Engineering, Zigong 643000, China
Qibiao Wang: School of Computer Science and Engineering, Sichuan University of Science and Engineering, Zigong 643000, China
Chao Deng: School of Physics and Electronic Engineering, Sichuan University of Science and Engineering, Zigong 643000, China
Bo Xie: School of Physics and Electronic Engineering, Sichuan University of Science and Engineering, Zigong 643000, China
Xianguo Tuo: School of Physics and Electronic Engineering, Sichuan University of Science and Engineering, Zigong 643000, China
Gang Jiang: School of Mechanical and Electrical Engineering, Chengdu University of Technology, Chengdu 610059, China

DOI: https://doi.org/10.3390/s24072061
Journal volume & issue: Vol. 24, no. 7, p. 2061

Abstract

Detecting transportation pipeline leakage points within chemical plants is difficult due to complex pathways, multi-dimensional survey points, and highly dynamic scenarios. However, the maneuverability and adaptability of hexapod robots make them ideal candidates for conducting surveys across different planes. Path planning for hexapod robots in multi-dimensional environments remains a significant challenge, especially when identifying suitable transition points and planning shorter paths to reach survey points while traversing multi-level environments. This study proposes a Particle Swarm Optimization (PSO)-guided Double Deep Q-Network (DDQN) approach, namely the PG-DDQN algorithm, for solving this problem. The proposed algorithm uses PSO in place of the traditional random selection strategy, and the data obtained from this guided approach are then used to train the DDQN neural network. The multi-dimensional random environment is abstracted into localized maps comprising the current and next-level planes. Comparative experiments were performed with PG-DDQN, standard DQN, and standard DDQN on multiple randomly generated localized maps to evaluate the algorithm’s performance. After each test iteration, the total reward value and completion time of each algorithm were recorded. The results demonstrate that PG-DDQN converged faster under an equivalent iteration count. Compared with standard DQN and standard DDQN, reductions in path-planning time of at least 33.94% and 42.60%, respectively, were observed, significantly improving the robot’s mobility. Finally, the PG-DDQN algorithm was integrated with sensors on a hexapod robot, and validation was performed through Gazebo simulations and experiments. The results show that controlling hexapod robots with PG-DDQN provides valuable insights for path planning to reach transportation pipeline leakage points within chemical plants.
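As a rough illustration of the two ideas the abstract combines, the sketch below shows the standard Double DQN target (the online network selects the next action and the target network evaluates it) next to an exploration step in which a guide's suggestion replaces the uniformly random action. This is not the authors' implementation: the abstract does not say how PSO produces its suggestion, so `guide_action` is a hypothetical placeholder for it, and the function names, `gamma`, and `epsilon` values are assumptions chosen for the example.

```python
import numpy as np


def dqn_target(reward, q_target_next, gamma=0.9, done=False):
    """Standard DQN target: the target network both selects and evaluates."""
    if done:
        return float(reward)
    return float(reward + gamma * np.max(q_target_next))


def ddqn_target(reward, q_online_next, q_target_next, gamma=0.9, done=False):
    """Double DQN target: online network selects, target network evaluates."""
    if done:
        return float(reward)
    best_action = int(np.argmax(q_online_next))
    return float(reward + gamma * q_target_next[best_action])


def select_action(q_values, epsilon, guide_action, rng):
    """Epsilon-greedy step whose exploratory branch follows a guide.

    `guide_action` is a hypothetical stand-in for a PSO-derived suggestion;
    a plain epsilon-greedy policy would draw a uniformly random action here.
    """
    if rng.random() < epsilon:
        return int(guide_action)
    return int(np.argmax(q_values))


# Toy next-state Q-values over three actions (made-up numbers).
q_online_next = np.array([1.0, 2.0, 0.5])
q_target_next = np.array([3.0, 1.5, 0.2])

y_dqn = dqn_target(1.0, q_target_next)                   # uses max of target net
y_ddqn = ddqn_target(1.0, q_online_next, q_target_next)  # uses target net at online argmax

rng = np.random.default_rng(0)
greedy = select_action(q_online_next, epsilon=0.0, guide_action=2, rng=rng)
guided = select_action(q_online_next, epsilon=1.0, guide_action=2, rng=rng)
```

With these toy values the Double DQN target is lower than the DQN target, because the action is chosen by the online network and not by the maximum of the target network's own estimates. Decoupling selection from evaluation in this way is what reduces overestimation in Double DQN.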

Published in Sensors

ISSN: 1424-8220 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Chemical technology
Website: http://www.mdpi.com/journal/sensors
