Nihon Kikai Gakkai ronbunshu (Mar 2023)

Recalling Unknown Manipulations by Spontaneously Sharing Actions with Similar Objects in Observation Based Learning

  • Makoto SANADA,
  • Tadashi MATSUO,
  • Nobutaka SHIMADA,
  • Yoshiaki SHIRAI

DOI
https://doi.org/10.1299/transjsme.22-00274
Journal volume & issue
Vol. 89, no. 920
pp. 22-00274 – 22-00274

Abstract

This paper proposes a method by which a robot recalls multiple candidate actions for an object after learning object manipulations from observations of human actions. When multiple answers to a single input are learned in a supervised regression manner, all correct answers usually have to be mapped to the same input. When object manipulations are observed, however, only one action can be observed for an object at a time, and the other possible actions are not always observed for the identical object. It is therefore important to automatically share the observed actions among similar-shaped objects by recognizing shape cues common to individual objects. The proposed method learns code descriptions of object shapes with a variational auto-encoder (VAE) that takes an object image as input, and code descriptions of actions with a conditional VAE (CVAE) that takes an action as input and the object shape as a condition. Since the action is the unknown target of recall, its code description should be obtained from the object shape alone during recall. The distribution of action code descriptions conditioned on the input object shape is obtained by marginalizing the distribution learned by the encoder part of the CVAE. However, since this marginalization is difficult to carry out either analytically or numerically, a deep regression model that “imitates” the marginal distribution is trained by a sampling-based maximum likelihood method. In this “marginalization by imitation” process, actions common to similar-shaped objects are shared among those objects. Various possible actions for the input object shape can then be recalled by repeatedly sampling from the imitated marginal distribution. This paper reports experiments using real object images and manipulation actions, which demonstrate the effectiveness of the proposed method.
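
As an illustration of the “marginalization by imitation” step, the relationship can be sketched as follows. The notation is an assumption introduced here and is not taken from the paper: $s$ is the shape code of an object, $a$ an action, $c$ the action code, $q_\phi(c \mid a, s)$ the CVAE encoder, $\mathrm{Dec}$ the CVAE decoder, and $r_\theta(c \mid s)$ the imitating regression model. The counts $N$, $K$, and $M$ are likewise hypothetical.

```latex
% Marginal needed at recall time, when the action a is unknown:
q(c \mid s) = \int q_\phi(c \mid a, s)\, p(a \mid s)\, \mathrm{d}a

% Sampling-based maximum likelihood for the imitating model r_\theta,
% using observed pairs (s_i, a_i), i = 1..N, and K encoder samples per pair:
\hat{\theta} = \arg\max_\theta \frac{1}{NK} \sum_{i=1}^{N} \sum_{k=1}^{K}
    \log r_\theta\!\left(c_{ik} \mid s_i\right),
\qquad c_{ik} \sim q_\phi(c \mid a_i, s_i)

% Recall: draw several codes from the imitated marginal and decode each one
c^{(m)} \sim r_{\hat{\theta}}(c \mid s), \qquad
a^{(m)} = \mathrm{Dec}\!\left(c^{(m)}, s\right), \quad m = 1, \dots, M
```

Under these assumptions, maximizing the sampled log-likelihood corresponds, up to a constant, to minimizing the expected Kullback-Leibler divergence from $q(c \mid s)$ to $r_\theta(c \mid s)$. Although each object contributes only one observed pair $(s_i, a_i)$, a regression model that varies smoothly in $s$ would plausibly pool the codes of objects with nearby shape codes, which is one way to read the sharing of actions among similar-shaped objects described above.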

Keywords