Adaptive Quadruped Balance Control for Dynamic Environments Using Maximum-Entropy Reinforcement Learning

Haoran Sun; Tingting Fu; Yuanhuai Ling; Chaoming He

doi:10.3390/s21175907

Sensors (Sep 2021)

Affiliations

Haoran Sun: School of Mechanical Engineering, Southwest Jiaotong University, Chengdu 610031, China
Tingting Fu: School of Mechanical Engineering, Southwest Jiaotong University, Chengdu 610031, China
Yuanhuai Ling: School of Mechanical Engineering, Southwest Jiaotong University, Chengdu 610031, China
Chaoming He: School of Mechanical Engineering, Southwest Jiaotong University, Chengdu 610031, China

DOI: https://doi.org/10.3390/s21175907
Journal volume & issue: Vol. 21, no. 17, p. 5907

Abstract

External disturbance poses the primary threat to robot balance in dynamic environments. This paper presents a learning-based control architecture for quadrupedal self-balancing that adapts to multiple unpredictable scenarios of continuous external disturbance. Unlike conventional methods, which construct analytical models that reason explicitly about the balancing process, our work uses reinforcement learning and an artificial neural network to avoid intricate mathematical modeling. The control policy is composed of a neural network and a Tanh Gaussian policy, which implicitly establishes the fuzzy mapping from proprioceptive signals to action commands. During training, a maximum-entropy method (the soft actor-critic algorithm) is employed to endow the policy with strong exploration and generalization ability. The trained policy is validated in both simulation and real-world experiments with a customized quadruped robot. The results demonstrate that the policy can be transferred to the real world without elaborate configuration. Moreover, although the policy is trained under only one specific vibration condition, it demonstrates robustness under conditions that were never encountered during training.
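
To make the term "Tanh Gaussian policy" concrete, the following is a minimal sketch in plain Python of how such a policy typically draws a bounded action and computes its log-probability, as is common in soft actor-critic implementations. It is an illustration only, not the authors' code: the function names, the three-dimensional action, and the mean and log-standard-deviation values are hypothetical, and in a full policy those two vectors would be produced by the neural network from the proprioceptive signals.

```python
import math
import random


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def tanh_gaussian_sample(mean, log_std, rng):
    """Draw a tanh-squashed Gaussian action and its log-probability.

    mean, log_std: per-dimension lists (assumed here; normally network outputs).
    rng: a random.Random instance, passed in for reproducibility.
    """
    action, log_prob = [], 0.0
    for mu, ls in zip(mean, log_std):
        std = math.exp(ls)
        u = mu + std * rng.gauss(0.0, 1.0)   # unbounded Gaussian sample
        action.append(math.tanh(u))          # squashed into [-1, 1]
        # Gaussian log-density of the unsquashed sample u
        log_prob += -0.5 * ((u - mu) / std) ** 2 - ls - 0.5 * math.log(2.0 * math.pi)
        # Change of variables: subtract log(1 - tanh(u)^2), in a stable form
        log_prob -= 2.0 * (math.log(2.0) - u - softplus(-2.0 * u))
    return action, log_prob


# Hypothetical values for a 3-dimensional action
rng = random.Random(0)
action, log_prob = tanh_gaussian_sample([0.0, 0.5, -0.5], [-1.0, -1.0, -1.0], rng)
```

The tanh squashing keeps every action component bounded, which suits actuator commands with fixed limits, while the log-probability is the quantity a maximum-entropy objective uses to reward exploratory behavior. How the bounded output is scaled to actual joint commands is not stated in this record.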

Published in Sensors

ISSN: 1424-8220 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Chemical technology
Website: http://www.mdpi.com/journal/sensors
