Path Planning of Mobile Robot in Dynamic Obstacle Avoidance Environment Based on Deep Reinforcement Learning

Qingfeng Zhang; Wenpeng Ma; Qingchun Zheng; Xiaofan Zhai; Wenqian Zhang; Tianchang Zhang; Shuo Wang

doi:10.1109/ACCESS.2024.3507016

IEEE Access (Jan 2024)

Affiliations

Qingfeng Zhang: Tianjin Key Laboratory for Advanced Mechatronic System Design and Intelligent Control, School of Mechanical Engineering, Tianjin University of Technology, Tianjin, China
Wenpeng Ma: Tianjin Key Laboratory for Advanced Mechatronic System Design and Intelligent Control, School of Mechanical Engineering, Tianjin University of Technology, Tianjin, China
Qingchun Zheng: Tianjin Key Laboratory for Advanced Mechatronic System Design and Intelligent Control, School of Mechanical Engineering, Tianjin University of Technology, Tianjin, China
Xiaofan Zhai: Tianjin Key Laboratory for Advanced Mechatronic System Design and Intelligent Control, School of Mechanical Engineering, Tianjin University of Technology, Tianjin, China
Wenqian Zhang: Tianjin Key Laboratory for Advanced Mechatronic System Design and Intelligent Control, School of Mechanical Engineering, Tianjin University of Technology, Tianjin, China
Tianchang Zhang: Tianjin Key Laboratory for Advanced Mechatronic System Design and Intelligent Control, School of Mechanical Engineering, Tianjin University of Technology, Tianjin, China
Shuo Wang: Tianjin Key Laboratory for Advanced Mechatronic System Design and Intelligent Control, School of Mechanical Engineering, Tianjin University of Technology, Tianjin, China

DOI: https://doi.org/10.1109/ACCESS.2024.3507016
Journal volume & issue: Vol. 12
pp. 189136 – 189152

Abstract

This study addresses three problems that mobile robots face in complex environments: sparse rewards caused by limited effective experience, slow learning in the early stages of training, and poor obstacle avoidance in environments with dynamic obstacles. The authors propose a new path planning algorithm for mobile robots that introduces an Intrinsic Curiosity Module (ICM) and Long Short-Term Memory (LSTM) into the Proximal Policy Optimization (PPO) algorithm. The ICM provides intrinsic rewards in addition to external rewards, which accelerates initial convergence. The Actor-Critic network is built on an LSTM-based neural network to improve the avoidance of dynamic obstacles. Simulation experiments were conducted in Gazebo in scenarios featuring both static and dynamic obstacles, with the TurtleBot3 mobile robot used for experimental verification. The experiments show that, compared with traditional algorithms, the proposed algorithm converges significantly faster in environments with sparse rewards, and the robot reaches more target points within a single episode, indicating more effective path planning. It can also avoid obstacles in various states. Finally, the algorithm was validated on a physical TurtleBot3 robot in real-world scenarios. Compared with Deep Deterministic Policy Gradient (DDPG), PPO, LSTM-PPO, and ICM-PPO, the proposed algorithm raises the path planning success rate by 9%, 7.2%, 4%, and 4.8%, respectively, in the most complex simulation environment and by 20%, 16%, 10%, and 12%, respectively, in the physical environment.
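
As a rough illustration of how an ICM-style intrinsic reward can supplement a sparse external reward, the sketch below follows the commonly used curiosity formulation, in which the bonus is the scaled squared error of a forward model's prediction of the next state's features. It is not taken from the paper: the function names, the scaling factor eta, and the mixing weight beta are assumptions made for illustration, and the feature vectors would in practice come from learned networks.

```python
from typing import Sequence


def intrinsic_reward(predicted_next: Sequence[float],
                     actual_next: Sequence[float],
                     eta: float = 0.01) -> float:
    """Curiosity bonus: (eta / 2) * squared prediction error.

    predicted_next: forward-model prediction of the next state's features.
    actual_next:    features actually observed for the next state.
    eta:            hypothetical scaling factor (assumed, not from the paper).
    """
    if len(predicted_next) != len(actual_next):
        raise ValueError("feature vectors must have the same length")
    squared_error = sum((p - a) ** 2
                        for p, a in zip(predicted_next, actual_next))
    return 0.5 * eta * squared_error


def total_reward(extrinsic: float, intrinsic: float, beta: float = 1.0) -> float:
    """Reward passed to the policy update: external reward plus weighted bonus.

    beta is a hypothetical mixing weight (assumed, not from the paper).
    """
    return extrinsic + beta * intrinsic


if __name__ == "__main__":
    # A poorly predicted transition earns a bonus even when the
    # environment itself returns no reward.
    bonus = intrinsic_reward([0.0, 0.0], [3.0, 4.0], eta=0.01)
    print(total_reward(extrinsic=0.0, intrinsic=bonus))
```

Under these assumptions, transitions the forward model predicts poorly earn a larger bonus, so the agent still receives a learning signal while external rewards are rare. This matches the abstract's account of intrinsic rewards speeding up early convergence.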

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
