A Collaborative Control Method of Dual-Arm Robots Based on Deep Reinforcement Learning

Luyu Liu; Qianyuan Liu; Yong Song; Bao Pang; Xianfeng Yuan; Qingyang Xu

doi:10.3390/app11041816

Applied Sciences (Feb 2021)

Affiliations

Luyu Liu: School of Mechanical, Electrical & Information Engineering, Shandong University, Weihai 264200, China
Qianyuan Liu: School of Mechanical, Electrical & Information Engineering, Shandong University, Weihai 264200, China
Yong Song: School of Mechanical, Electrical & Information Engineering, Shandong University, Weihai 264200, China
Bao Pang: School of Mechanical, Electrical & Information Engineering, Shandong University, Weihai 264200, China
Xianfeng Yuan: School of Mechanical, Electrical & Information Engineering, Shandong University, Weihai 264200, China
Qingyang Xu: School of Mechanical, Electrical & Information Engineering, Shandong University, Weihai 264200, China

DOI: https://doi.org/10.3390/app11041816
Journal volume & issue: Vol. 11, no. 4, p. 1816

Abstract

Collaborative control of a dual-arm robot means that the two arms avoid colliding with each other while working together to accomplish a task. To prevent collisions, the control strategy of each arm must avoid competing with the other arm and must cooperate with it during motion planning. This paper proposes a dual-arm deep deterministic policy gradient (DADDPG) algorithm based on multi-agent cooperative deep reinforcement learning. First, the construction of the replay buffer in the hindsight experience replay algorithm is introduced, and the modeling and training method of the multi-agent deep deterministic policy gradient algorithm is explained. Second, a control strategy is assigned to each robotic arm, and the arms share their observations and actions. The dual-arm robot is trained under a mechanism of “rewarding cooperation and punishing competition”. Finally, the effectiveness of the algorithm is verified in the Reach, Push, and Pick up simulation environments built in this study. The experimental results show that a robot trained with the DADDPG algorithm can accomplish cooperative tasks. The algorithm lets the arms explore the action space autonomously and reduces the level of competition between them. The collaborative robots show better adaptability to coordination tasks.
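
The abstract mentions building a replay buffer with hindsight experience replay (HER) but gives no details. As a rough illustration only, the sketch below shows one common way HER relabels a failed episode: each transition is stored again with the goal replaced by the state the episode actually ended in, and the sparse reward is recomputed. The "final" relabeling strategy, the -1/0 reward, the tolerance value, and every name here (`Transition`, `sparse_reward`, `her_relabel_final`) are assumptions made for this example and are not taken from the paper.

```python
import math
from dataclasses import dataclass, replace
from typing import List, Tuple

Vec = Tuple[float, ...]


@dataclass(frozen=True)
class Transition:
    """One step of an episode (hypothetical layout, not the paper's)."""
    obs: Vec        # observation shared by both arms
    actions: Vec    # concatenated actions of both arms
    achieved: Vec   # goal-space position actually reached after the step
    goal: Vec       # goal the episode was trying to reach
    reward: float


def sparse_reward(achieved: Vec, goal: Vec, tol: float = 0.05) -> float:
    """Return 0.0 if `achieved` is within `tol` of `goal`, otherwise -1.0."""
    return 0.0 if math.dist(achieved, goal) <= tol else -1.0


def her_relabel_final(episode: List[Transition], tol: float = 0.05) -> List[Transition]:
    """Return the original transitions followed by relabeled copies.

    The relabeled copies use the last achieved position of the episode as
    their goal ("final" strategy), so at least the last step counts as a
    success even when the original goal was never reached.
    """
    if not episode:
        return []
    new_goal = episode[-1].achieved
    relabeled = [
        replace(t, goal=new_goal, reward=sparse_reward(t.achieved, new_goal, tol))
        for t in episode
    ]
    return list(episode) + relabeled


# Usage: a three-step episode that never reaches its goal (1.0, 1.0).
episode = [
    Transition((0.0,), (0.1, 0.0), (0.1, 0.0), (1.0, 1.0), -1.0),
    Transition((0.1,), (0.1, 0.0), (0.2, 0.0), (1.0, 1.0), -1.0),
    Transition((0.2,), (0.1, 0.0), (0.3, 0.0), (1.0, 1.0), -1.0),
]
buffer = her_relabel_final(episode)
```

In this sketch the buffer ends up with twice as many transitions as the episode, and the relabeled copy of the last step carries a success reward. That is how HER gives an off-policy learner such as DDPG a learning signal from otherwise unsuccessful rollouts.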

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci
