Real-time Trajectory Planning Algorithm Based on Collision Criticality and Deep Reinforcement Learning

XU Linling, ZHOU Yuan, HUANG Hongyun, LIU Yang

doi:10.11896/jsjkx.220100007

Jisuanji kexue (Mar 2023)

Affiliations

XU Linling, ZHOU Yuan, HUANG Hongyun, LIU Yang: 1 School of Information Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China; 2 School of Computer Science and Engineering, Nanyang Technological University, Singapore 639798, Singapore; 3 Center of Library Big Data Processing and Analysis, Zhejiang Sci-Tech University, Hangzhou 310018, China

DOI: https://doi.org/10.11896/jsjkx.220100007
Journal volume & issue: Vol. 50, no. 3
pp. 323 – 332

Abstract

Real-time collision avoidance in dynamic environments is a challenge in trajectory planning for mobile robots. Focusing on environments with a variable number of obstacles, this paper proposes a real-time trajectory planning algorithm, Crit-LSTM-DRL, based on long short-term memory (LSTM) and deep reinforcement learning (DRL). First, it predicts the time until a collision between each obstacle and the robot based on their states, and then computes the collision criticality of each obstacle with respect to the robot. Second, it generates the obstacle sequence based on the collision criticality and uses an LSTM to extract a fixed-dimension vector that represents the environment. Finally, the robot state and the extracted vector are concatenated as the input of the DRL value network, which computes the value of the system state. At any instant, for each action, it predicts the value of the next state using the LSTM and DRL models and, from it, the value of the current state; the action yielding the maximal value of the current state is then selected to control the robot. To evaluate the performance of Crit-LSTM-DRL, it is first trained in three different environments to obtain three models: one trained in an environment with 5 obstacles, one trained in an environment with 10 obstacles, and one trained in an environment with a variable number of obstacles (1 to 10). The models are then tested in various environments containing different numbers of obstacles. To further investigate the effects of the interaction between an obstacle and the robot, this paper also takes the joint state of an obstacle and the robot as the state of the obstacle and trains another three models in the above training environments. Experimental results show the effectiveness and efficiency of Crit-LSTM-DRL.
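The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the abstract does not give the exact definitions, so several points are assumptions made here for illustration: agents are discs moving at constant velocity, criticality is taken as exp(-t) of the predicted time to collision t, obstacles are ordered from least to most critical, the reward term defaults to zero, and `value_fn` is a hypothetical stand-in for the LSTM plus value network. All names (`Agent`, `time_to_collision`, `collision_criticality`, `order_by_criticality`, `select_action`) are likewise hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Agent:
    """Disc-shaped robot or obstacle with position, velocity and radius."""
    px: float
    py: float
    vx: float
    vy: float
    radius: float


def time_to_collision(robot, obstacle):
    """Earliest t >= 0 at which the two discs touch, assuming constant
    velocities (an assumption of this sketch); math.inf if they never do."""
    rx, ry = obstacle.px - robot.px, obstacle.py - robot.py
    vx, vy = obstacle.vx - robot.vx, obstacle.vy - robot.vy
    r = robot.radius + obstacle.radius
    c = rx * rx + ry * ry - r * r
    if c <= 0.0:
        return 0.0  # already touching or overlapping
    a = vx * vx + vy * vy
    b = rx * vx + ry * vy
    if a == 0.0 or b >= 0.0:
        return math.inf  # no relative motion, or moving apart
    disc = b * b - a * c
    if disc < 0.0:
        return math.inf  # paths never come within r of each other
    return (-b - math.sqrt(disc)) / a


def collision_criticality(robot, obstacle):
    """Assumed mapping: 1.0 for an imminent collision, 0.0 if none is predicted."""
    return math.exp(-time_to_collision(robot, obstacle))


def order_by_criticality(robot, obstacles):
    """Assumed ordering: least critical first, so the most critical obstacle
    would be the last element fed to a sequence model."""
    return sorted(obstacles, key=lambda o: collision_criticality(robot, o))


def select_action(robot, obstacles, actions, value_fn,
                  reward_fn=lambda robot, obstacles: 0.0, dt=0.25, gamma=0.9):
    """One-step lookahead: propagate every agent by dt, score the next state
    with value_fn (stand-in for the LSTM + value network), keep the best."""
    best_action, best_value = None, -math.inf
    for ax, ay in actions:
        next_robot = Agent(robot.px + ax * dt, robot.py + ay * dt,
                           ax, ay, robot.radius)
        next_obstacles = [Agent(o.px + o.vx * dt, o.py + o.vy * dt,
                                o.vx, o.vy, o.radius) for o in obstacles]
        ordered = order_by_criticality(next_robot, next_obstacles)
        value = (reward_fn(next_robot, ordered)
                 + gamma * value_fn(next_robot, ordered))
        if value > best_value:
            best_action, best_value = (ax, ay), value
    return best_action


if __name__ == "__main__":
    robot = Agent(0.0, 0.0, 1.0, 0.0, 0.5)
    head_on = Agent(5.0, 0.0, -1.0, 0.0, 0.5)
    receding = Agent(0.0, 10.0, 0.0, 1.0, 0.5)

    # Toy value function: penalise the most critical obstacle in the next state.
    def toy_value(r, obs):
        return -max((collision_criticality(r, o) for o in obs), default=0.0)

    print(time_to_collision(robot, head_on))
    print(select_action(robot, [head_on, receding],
                        [(1.0, 0.0), (0.0, 1.0)], toy_value))
```

Because the obstacle list is reduced to an ordered sequence before scoring, the same functions work for any number of obstacles, which is the property the abstract attributes to the LSTM encoding. In a trained system, `toy_value` would be replaced by the learned network.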

Keywords: trajectory planning, collision avoidance, obstacle criticality, deep reinforcement learning

Published in Jisuanji kexue

ISSN: 1002-137X (Print)
Publisher: Editorial office of Computer Science
Country of publisher: China
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science: Computer software; Technology: Technology (General)
Website: http://www.jsjkx.com/CN/1002-137X/home.shtml
