IEEE Access (Jan 2022)
Sample-Efficient Training of Robotic Guide Using Human Path Prediction Network
Abstract
Training a robot that engages with people is challenging: the training process requires numerous data samples, and it is expensive to involve people in it directly. This paper presents an alternative approach to this problem. We propose a human path prediction network (HPPN), a recurrent neural network that generates a user’s future trajectory from sequential robot actions and the corresponding human responses. We then present an evolution-strategy-based robot training method that uses only the virtual human movements generated by the HPPN. We demonstrate that the proposed method permits sample-efficient training of a robotic guide for visually impaired people. By collecting only 1.5 K episodes from real users, we were able to train the HPPN and generate the more than 100 K virtual episodes required for training the robot. The trained robot precisely guided blindfolded participants along a target path. Furthermore, using virtual episodes, we investigated, without incurring additional costs, a new reward design that prioritizes human comfort during the robot’s guidance. This sample-efficient training method is expected to be widely applicable to future robots that interact physically with humans.
Keywords