Hierarchical DDPG for Manipulator Motion Planning in Dynamic Environments

Dugan Um; Prasad Nethala; Hocheol Shin

doi:10.3390/ai3030037

AI (Aug 2022)

Affiliations

Dugan Um: School of Engineering & Computing Sciences, Texas A&M University-Corpus Christi, TX 78412, USA
Prasad Nethala: Geospatial Systems Engineering, Texas A&M University, Corpus Christi, TX 78412, USA
Hocheol Shin: Nuclear Robot and Diagnosis Team, Korea Atomic Energy Research Institute, Daejeon 34057, Korea

DOI: https://doi.org/10.3390/ai3030037
Journal volume & issue: Vol. 3, no. 3
pp. 645 – 658

Abstract

This paper proposes and studies a hierarchical reinforcement learning (HRL) architecture, the “Hierarchical Deep Deterministic Policy Gradient (HDDPG)”. Like other HRL structures, the HDDPG uses a manager and worker formation. Unlike them, however, it lets the manager and workers share an identical environment and state, while each Deep Deterministic Policy Gradient (DDPG) agent requires its own reward system. The HDDPG therefore allows easy structural expansion through the manager's probabilistic selection of a worker's action. Owing to this structural advantage, the HDDPG is well suited to building a general AI that handles complex time-horizon tasks with various conflicting sub-goals. Experiments demonstrate its usefulness on a manipulator motion planning problem in a dynamic environment, where path planning and collision avoidance conflict with each other. The proposed HDDPG is compared with an HAM and a single DDPG for performance evaluation. The results show that the HDDPG achieves a reward gain of more than 40% and more than twice the reward improvement rate. Another important feature of the proposed HDDPG is its biased manager training capability: by adding a preference factor to each worker, the manager can be trained to prefer a certain worker and so achieve a better success rate for a specific objective when needed.
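
To make the manager and worker formation concrete, the following is a minimal Python sketch of one decision step, not the authors' implementation. It assumes the manager produces one score per worker, that selection is a softmax over those scores, and that the preference factor enters as an additive bias; the abstract specifies none of these details. The two worker functions, the `state` dictionary keys, and all function names are hypothetical stand-ins for trained DDPG policies.

```python
import math
import random


def selection_probabilities(scores, preferences=None):
    """Softmax over manager scores, with an optional additive preference per worker.

    Assumption: the additive bias is only one possible reading of the
    "preference factor" mentioned in the abstract.
    """
    if preferences is None:
        preferences = [0.0] * len(scores)
    biased = [s + p for s, p in zip(scores, preferences)]
    peak = max(biased)  # subtract the max for numerical stability
    exps = [math.exp(b - peak) for b in biased]
    total = sum(exps)
    return [e / total for e in exps]


def select_worker(scores, preferences=None, rng=random):
    """Sample a worker index according to the selection probabilities."""
    probs = selection_probabilities(scores, preferences)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical stand-ins for two trained worker policies. Both read the
# same shared state, mirroring the shared environment described above.
def path_planning_worker(state):
    return [0.1 * (g - q) for q, g in zip(state["joints"], state["goal"])]


def collision_avoidance_worker(state):
    return [-0.1 * d for d in state["obstacle_direction"]]


WORKERS = [path_planning_worker, collision_avoidance_worker]


def hddpg_step(state, manager_scores, preferences=None, rng=random):
    """Pick one worker probabilistically and return its index and action."""
    idx = select_worker(manager_scores, preferences, rng)
    return idx, WORKERS[idx](state)
```

In this sketch, raising one worker's preference value shifts probability mass toward that worker without changing the workers themselves, which is one simple way a manager could be biased toward, for example, collision avoidance.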

Published in AI

ISSN: 2673-2688 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://www.mdpi.com/journal/ai
