IEEE Access (Jan 2023)

A Hierarchical Robot Learning Framework for Manipulator Reactive Motion Generation via Multi-Agent Reinforcement Learning and Riemannian Motion Policies

  • Yuliu Wang,
  • Ryusuke Sagawa,
  • Yusuke Yoshiyasu

DOI
https://doi.org/10.1109/ACCESS.2023.3324039
Journal volume & issue
Vol. 11
pp. 126979–126994

Abstract


Manipulator motion planning faces new challenges as robots are increasingly deployed in dense, cluttered, and dynamic environments. The recently proposed technique of Riemannian motion policies (RMPs) provides an elegant solution with a clear mathematical interpretation for such challenging scenarios: policies grounded in differential geometry generate reactive motions in dynamic environments with real-time performance. However, designing and combining RMPs remains difficult and involves extensive parameter tuning; typically, seven or more RMPs must be combined via RMPflow to realize the motions of a manipulator with more than six degrees of freedom, and the RMP parameters have to be set empirically each time. In this paper, we decompose such complex policies into multiple learning modules based on reinforcement learning. Specifically, we propose a three-layer robot learning framework consisting of basic-level, middle-level, and top-level layers. At the basic level, only two base RMPs, namely target attraction and collision avoidance, are used to output reactive actions. At the middle level, a hierarchical reinforcement learning approach trains an agent, deployed at each joint, that automatically selects these RMPs and their parameters in response to environmental changes. At the top level, a multi-agent reinforcement learning approach trains all joints with high-level collaborative policies to accomplish tasks such as tracking a target and avoiding obstacles. In simulation experiments comparing the proposed method with a baseline, our method produces superior actions and is better at avoiding obstacles, handling self-collisions, and avoiding singularities in dynamic environments. In addition, the proposed framework achieves higher training efficiency while leveraging the generalization ability of reinforcement learning to dynamic environments and improving safety and interpretability.
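As context for the base layer described above, the sketch below illustrates in plain NumPy how two RMPs, a target attractor and an obstacle repulsor, can be combined by a metric-weighted pullback in the style of RMPflow. This is a minimal sketch under simplifying assumptions: curvature (Jacobian-derivative) terms are omitted, and the gains and function names (rmp_target, rmp_collision, combine_rmps) are illustrative, not the paper's actual implementation.

```python
import numpy as np

# Each RMP returns a desired acceleration f and a Riemannian metric A
# in its own task space x = phi(q), with Jacobian J = dphi/dq.

def rmp_target(x, x_dot, goal, alpha=1.0, beta=2.0):
    """Attractor RMP: PD-style pull toward the goal, constant metric."""
    f = alpha * (goal - x) - beta * x_dot
    A = np.eye(len(x))
    return f, A

def rmp_collision(x, x_dot, obstacle, eta=1.0, margin=0.2):
    """Repulsor RMP: pushes away from an obstacle; weight grows near it."""
    d_vec = x - obstacle
    d = np.linalg.norm(d_vec) + 1e-9
    f = eta / d**2 * (d_vec / d)           # repulsion along the outward direction
    w = max(0.0, margin - d) / margin      # weight -> 1 as the obstacle nears
    A = w * np.eye(len(x))
    return f, A

def combine_rmps(rmps_with_jacobians, n_dof):
    """Metric-weighted combination in configuration space:
    qdd = (sum J^T A J)^+  sum J^T A f   (curvature terms omitted)."""
    M = np.zeros((n_dof, n_dof))
    p = np.zeros(n_dof)
    for (f, A), J in rmps_with_jacobians:
        M += J.T @ A @ J
        p += J.T @ A @ f
    return np.linalg.pinv(M) @ p
```

For a point task space equal to the end-effector position, J would be the manipulator's positional Jacobian; the learned middle-level agent in the framework plays the role of choosing which RMPs are active and setting parameters such as alpha, eta, and margin, rather than fixing them by hand.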

Keywords