Recalling Unknown Manipulations by Spontaneously Sharing Actions with Similar Objects in Observation Based Learning

Makoto SANADA; Tadashi MATSUO; Nobutaka SHIMADA; Yoshiaki SHIRAI

doi:10.1299/transjsme.22-00274

Nihon Kikai Gakkai ronbunshu (Mar 2023)

Affiliations

Makoto SANADA: Graduate School of Information Science and Engineering, Ritsumeikan University
Tadashi MATSUO: College of Information Science and Engineering, Ritsumeikan University
Nobutaka SHIMADA: College of Information Science and Engineering, Ritsumeikan University
Yoshiaki SHIRAI: College of Information Science and Engineering, Ritsumeikan University

DOI: https://doi.org/10.1299/transjsme.22-00274
Journal volume & issue: Vol. 89, no. 920
pp. 22-00274 – 22-00274

Abstract

This paper proposes a method by which a robot recalls multiple action candidates for an object, having learned object manipulations by observing human actions. When learning multiple answers to a single input in a supervised regression manner, it is usually necessary to map all correct answers to the same input. However, when observing object manipulations, only one action can be observed for an object at a time, and the other possible actions are not always observed for the identical object. It is therefore important to automatically share the various observed actions among similar-shaped objects by recognizing shape cues common to individual objects. The proposed method learns code descriptions of object shapes with a variational auto-encoder (VAE) that takes an object image as input, and code descriptions of actions with a conditional VAE (CVAE) that takes an action as input and the object shape as a condition. Since the action is the unknown recall target, it is desirable to obtain the code description of the action from the object shape alone during recall. The distribution of action code descriptions conditioned on the input object shape is obtained by marginalizing the distribution learned by the encoder part of the CVAE. However, since this marginalization is difficult to carry out either analytically or numerically, a deep regression model that “imitates” this marginal distribution is trained by a sampling-based maximum likelihood method. Through this “marginalization by imitation” process, actions common to similar-shaped objects are shared among those objects. Various possible actions for the input object shape can be recalled by repeatedly sampling from the imitated marginal distribution. This paper describes the results of experiments using actual object images and manipulation actions, and demonstrates the effectiveness of the proposed method.
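
The “marginalization by imitation” step can be sketched in symbols as follows. The notation is an assumption made here for illustration and is not taken from the paper: s is the shape code from the VAE, a is an observed action, z is the action code, q_φ(z | a, s) is the CVAE encoder, and r_θ(z | s) is the deep regression model that imitates the marginal. The sample counts N (observed shape-action pairs) and K (encoder samples per pair) are likewise hypothetical.

```latex
% Target: the CVAE encoder marginalized over actions, conditioned on shape s
% (hypothetical notation; p(a | s) is not available in closed form)
q_\phi(z \mid s) = \int q_\phi(z \mid a, s)\, p(a \mid s)\, \mathrm{d}a

% Sampling-based maximum likelihood fit of the imitating model r_\theta
\hat{\theta} = \arg\max_{\theta} \sum_{i=1}^{N} \frac{1}{K} \sum_{k=1}^{K}
  \log r_\theta\bigl(z_{i,k} \mid s_i\bigr),
\qquad z_{i,k} \sim q_\phi(z \mid a_i, s_i)

% Recall for a new shape s^*: draw several codes from the imitated marginal
z^{(j)} \sim r_{\hat{\theta}}(z \mid s^*), \qquad j = 1, \dots, J
```

Under this reading, each training pair contributes only the single action observed for its object, and one plausible way the sharing arises is that r_θ is fitted as a function of s, so that objects with nearby shape codes pool their observed actions. Each sampled code z^(j) would then be passed, together with s*, to the CVAE decoder to produce one action candidate.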

Published in Nihon Kikai Gakkai ronbunshu

ISSN: 2187-9761 (Online)
Publisher: The Japan Society of Mechanical Engineers
Country of publisher: Japan
LCC subjects: Technology: Mechanical engineering and machinery; Technology: Engineering (General). Civil engineering (General): Engineering machinery, tools, and implements
Website: https://www.jsme.or.jp/publish/transact/
