Deep reinforcement learning-based rehabilitation robot trajectory planning with optimized reward functions

Xusheng Wang; Jiexin Xie; Shijie Guo; Yue Li; Pengfei Sun; Zhongxue Gan

doi:10.1177/16878140211067011

Advances in Mechanical Engineering (Dec 2021)

Affiliations

Xusheng Wang: Academy for Engineering and Technology, Fudan University, Shanghai, China
Jiexin Xie: Academy for Engineering and Technology, Fudan University, Shanghai, China
Shijie Guo: Academy for Engineering and Technology, Fudan University, Shanghai, China
Yue Li: Department of Computer Technology, Hebei College of Industry and Technology, Shijiazhuang, China
Pengfei Sun: Beijing Smartchip Microelectronics Technology Company Limited, Beijing, China
Zhongxue Gan: Academy for Engineering and Technology, Fudan University, Shanghai, China

Journal volume & issue: Vol. 13

Abstract

Deep reinforcement learning (DRL) provides a new solution for rehabilitation robot trajectory planning in unstructured working environments, which can bring great convenience to patients. Previous research has mainly focused on optimization strategies but ignored the construction of reward functions, which leads to low efficiency. In contrast to the traditional sparse reward function, this paper proposes two dense reward functions. First, an azimuth reward function provides global guidance and reasonable constraints during exploration. To further improve efficiency, a process-oriented aspiration reward function is proposed; it accelerates the exploration process and helps avoid locally optimal solutions. Experiments show that the proposed reward functions accelerate the convergence rate by 38.4% on average with mainstream DRL methods. The mean of convergence also increases by 9.5%, and the percentage of standard deviation decreases by 21.2%–23.3%. The results show that the proposed reward functions can significantly improve the learning efficiency of DRL methods and thus make automatic trajectory planning for rehabilitation robots practically feasible.
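The abstract does not give the reward formulas, so the following Python sketch only illustrates the general difference between a sparse reward and dense, direction-based and progress-based rewards. Every function name, the tolerance `tol`, and the specific formulas (cosine alignment with the goal direction, bonus for beating the best distance so far) are assumptions made for illustration and should not be read as the authors' azimuth or aspiration reward functions.

```python
import math

def sparse_reward(pos, goal, tol=0.05):
    """Sparse signal: 1.0 only when the position is within tol of the goal."""
    return 1.0 if math.dist(pos, goal) <= tol else 0.0

def azimuth_style_reward(prev_pos, pos, goal):
    """Hypothetical dense signal: cosine of the angle between the step
    just taken and the direction from the previous position to the goal."""
    step = [b - a for a, b in zip(prev_pos, pos)]
    to_goal = [g - a for a, g in zip(prev_pos, goal)]
    step_norm = math.hypot(*step)
    goal_norm = math.hypot(*to_goal)
    if step_norm == 0.0 or goal_norm == 0.0:
        return 0.0  # no movement, or already at the goal: no directional signal
    dot = sum(s * t for s, t in zip(step, to_goal))
    return dot / (step_norm * goal_norm)

def aspiration_style_reward(pos, goal, best_dist):
    """Hypothetical process-oriented signal: reward only the improvement over
    the best distance reached so far. Returns (reward, updated best distance)."""
    d = math.dist(pos, goal)
    if d < best_dist:
        return best_dist - d, d
    return 0.0, best_dist

goal = (1.0, 0.0, 0.0)
start = (0.0, 0.0, 0.0)
toward = (0.1, 0.0, 0.0)   # a step toward the goal
away = (-0.1, 0.0, 0.0)    # a step away from the goal

r_sparse = sparse_reward(toward, goal)
r_toward = azimuth_style_reward(start, toward, goal)
r_away = azimuth_style_reward(start, away, goal)
r_asp, best = aspiration_style_reward((0.5, 0.0, 0.0), goal, best_dist=1.0)
```

In this toy setting the sparse reward gives no feedback for either step, while the dense terms already distinguish a step toward the goal from a step away from it, which is the kind of guidance during exploration that the abstract attributes to dense rewards.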

Published in Advances in Mechanical Engineering

ISSN: 1687-8132 (Print); 1687-8140 (Online)
Publisher: SAGE Publishing
Country of publisher: United Kingdom
LCC subjects: Technology: Mechanical engineering and machinery
Website: https://journals.sagepub.com/home/ade