A Self-Adaptive Double Q-Backstepping Trajectory Tracking Control Approach Based on Reinforcement Learning for Mobile Robots

Naifeng He; Zhong Yang; Xiaoliang Fan; Jiying Wu; Yaoyu Sui; Qiuyan Zhang

doi:10.3390/act12080326

Actuators (Aug 2023)

Affiliations

Naifeng He: College of Automation Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
Zhong Yang: College of Automation Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
Xiaoliang Fan: State Key Laboratory of Robotics, Shenyang Institute of Automation Chinese Academy of Sciences, Shenyang 110017, China
Jiying Wu: College of Automation Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
Yaoyu Sui: College of Automation Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
Qiuyan Zhang: Electric Power Research Institute of Guizhou Power Grid Co., Ltd., Guiyang 550002, China

DOI: https://doi.org/10.3390/act12080326
Journal volume & issue: Vol. 12, no. 8
p. 326

Abstract

When a mobile robot performs indoor inspection tasks with complex requirements, the traditional backstepping method cannot guarantee trajectory accuracy. As a result, the instrument may fall outside the image frame, or focusing may fail, when the robot captures images at high zoom. To solve this problem, this paper proposes an adaptive backstepping method based on double Q-learning for trajectory tracking control of mobile robots. We design an incremental, model-free Double Q-learning algorithm that quickly learns to adjust the trajectory tracking controller gain online. To handle gain adjustment under non-uniform exploration of the state space, we propose an incremental active-learning exploration algorithm that incorporates memory replay and experience replay mechanisms, enabling the agent to learn quickly online and adjust the controller gain. To verify the feasibility of the algorithm, we test it on different types of trajectories in Gazebo and on physical platforms. The results show that the adaptive trajectory tracking control algorithm can adjust the gain of the mobile robot’s trajectory tracking controller. Compared with the Backstepping-Fractional-Order PID controller and the Fuzzy-Backstepping controller, Double Q-backstepping offers better robustness, generalization, and real-time performance, as well as stronger anti-disturbance capability.
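For readers unfamiliar with the learning rule the abstract refers to, the following is a minimal sketch of one generic tabular Double Q-learning update. It is not the authors' implementation: the function name `double_q_update`, the table layout, and the values of `alpha` and `gamma` are assumptions made for illustration. The abstract does not specify how states, actions (for example, controller gain adjustments), or rewards are defined, so they appear here only as integer indices and a scalar.

```python
import random


def double_q_update(q_a, q_b, state, action, reward, next_state,
                    alpha=0.1, gamma=0.9, rng=random):
    """Apply one tabular Double Q-learning step in place.

    q_a and q_b are lists of lists indexed as q[state][action].
    Returns True if q_a was the table updated, False if q_b was.
    All parameter names and defaults are illustrative assumptions.
    """
    # Randomly choose which table is updated; the other one evaluates.
    if rng.random() < 0.5:
        update, evaluate = q_a, q_b
    else:
        update, evaluate = q_b, q_a

    # Pick the greedy next action using the table being updated ...
    row = update[next_state]
    best_next = max(range(len(row)), key=row.__getitem__)

    # ... but score that action with the other table, which is what
    # reduces the overestimation bias of ordinary Q-learning.
    target = reward + gamma * evaluate[next_state][best_next]
    update[state][action] += alpha * (target - update[state][action])
    return update is q_a
```

In a gain-tuning setting, one could imagine each action index standing for a hypothetical increment or decrement of a backstepping gain, with the reward derived from tracking error; that mapping is an assumption here, not something stated in the abstract.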

Published in Actuators

ISSN: 2076-0825 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering: Materials of engineering and construction. Mechanics of materials; Technology: Electrical engineering. Electronics. Nuclear engineering: Production of electric energy or power. Powerplants. Central stations
Website: http://www.mdpi.com/journal/actuators
