Reinforcement Learning of Bipedal Walking Using a Simple Reference Motion

Naoya Itahashi; Hideaki Itoh; Hisao Fukumoto; Hiroshi Wakuya

doi:10.3390/app14051803

Applied Sciences (Feb 2024)

Affiliations

Naoya Itahashi: Electrical and Electronic Engineering Course, Graduate School of Science and Engineering, Saga University, Saga 840-8502, Japan
Hideaki Itoh: Department of Electrical and Electronic Engineering, Faculty of Science and Engineering, Saga University, Saga 840-8502, Japan
Hisao Fukumoto: Department of Electrical and Electronic Engineering, Faculty of Science and Engineering, Saga University, Saga 840-8502, Japan
Hiroshi Wakuya: Integrated Center for Educational Research and Development, Faculty of Education, Saga University, Saga 840-8502, Japan

DOI: https://doi.org/10.3390/app14051803
Journal volume & issue: Vol. 14, no. 5, p. 1803

Abstract

In this paper, a novel reinforcement learning method that enables a humanoid robot to learn bipedal walking using a simple reference motion is proposed. Reinforcement learning has recently emerged as a useful method for robots to learn bipedal walking, but, in many studies, a reference motion is necessary for successful learning, and it is laborious or costly to prepare a reference motion. To overcome this problem, our proposed method uses a simple reference motion consisting of three sine waves and automatically sets the waveform parameters using Bayesian optimization. Thus, the reference motion can easily be prepared with minimal human involvement. Moreover, we introduce two means to facilitate reinforcement learning: (1) we combine reinforcement learning with inverse kinematics (IK), and (2) we use the reference motion as a bias for the action determined via reinforcement learning, rather than as an imitation target. Through numerical experiments, we show that our proposed method enables bipedal walking to be learned based on a small number of samples. Furthermore, we conduct a zero-shot sim-to-real transfer experiment using a domain randomization method and demonstrate that a real humanoid robot, KHR-3HV, can walk with the controller acquired using the proposed method.
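The idea of using the reference motion as a bias rather than as an imitation target can be illustrated with a minimal sketch. The abstract does not state which quantities the three sine waves drive, how they are parameterized, or how the bias is combined with the policy output, so everything below is an assumption made for illustration: the `SineWave` class, the amplitude/period/phase parameterization, the three hypothetical target coordinates, the numeric values, and the simple element-wise addition. Inverse kinematics and Bayesian optimization of the waveform parameters are omitted.

```python
import math
from dataclasses import dataclass


@dataclass
class SineWave:
    """One sine component of a hypothetical reference motion."""
    amplitude: float  # assumed units: metres
    period: float     # assumed units: seconds
    phase: float      # radians

    def value(self, t: float) -> float:
        return self.amplitude * math.sin(2.0 * math.pi * t / self.period + self.phase)


def reference_motion(waves, t):
    """Evaluate every sine wave at time t (one value per assumed target coordinate)."""
    return [w.value(t) for w in waves]


def biased_action(policy_output, reference):
    """Add the reference motion to the policy output as a bias (assumed element-wise sum)."""
    return [r + a for r, a in zip(reference, policy_output)]


# Illustrative waveform parameters; in the paper these are set by Bayesian optimization.
waves = [
    SineWave(amplitude=0.02, period=1.0, phase=0.0),
    SineWave(amplitude=0.01, period=1.0, phase=math.pi / 2),
    SineWave(amplitude=0.015, period=0.5, phase=0.0),
]

ref = reference_motion(waves, t=0.25)

# Placeholder policy output; a trained policy would produce this from the robot state.
action = biased_action([0.001, -0.002, 0.0], ref)
```

Under this reading, a policy that outputs zeros reproduces the reference motion exactly, so learning starts from a plausible gait and only has to learn corrections to it.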

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci
