IEEE Access (Jan 2023)

Reward for Exploration Based on View Synthesis

  • Shun Taguchi,
  • Satoshi Koide,
  • Kazuki Shibata,
  • Hideki Deguchi

DOI
https://doi.org/10.1109/ACCESS.2023.3326883
Journal volume & issue
Vol. 11
pp. 118830 – 118840

Abstract

Research on embodied AI has flourished in recent years as a way to ground AI in real-world information. Visual exploration is a fundamental task in embodied-AI applications such as object-goal navigation, embodied question answering (EQA), and rearrangement, yet it remains challenging. Frontier-based methods are successful but difficult to incorporate into reinforcement learning (RL). Moreover, they rely heavily on a two-dimensional grid-map representation, which makes them difficult to apply to free movement in three-dimensional environments. We propose a novel reward for RGB-D camera-based exploration that maximizes the amount of new information contained in the observations obtained from the camera. The basic idea of our method is to predict the destination image by view synthesis using a point cloud obtained by back-projecting depth information. The more missing regions this predicted image contains, the more likely the destination is to contain unknown information. For efficient exploration, we also propose a topological-map implementation that prevents the agent from repeatedly visiting the same states. Our method achieves coverage of area, objects, and landmarks comparable to that of state-of-the-art visual exploration methods without using two-dimensional grid maps. Furthermore, we implement object-goal navigation by integrating object detection with simple point-goal navigation, and it outperforms a task-specific RL method with the same architecture in success rate.
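The core idea described in the abstract can be sketched as follows: back-project a depth image into a point cloud, reproject it into a candidate destination view, and treat the fraction of destination pixels that no observed point lands on as the exploration reward. This is a minimal NumPy sketch, not the authors' implementation; the function names (`backproject`, `view_synthesis_reward`), the pinhole-camera parameters, and the use of a raw pixel-coverage fraction are all illustrative assumptions.

```python
import numpy as np

def backproject(depth, fx, fy, cx, cy):
    """Back-project a depth image into a 3-D point cloud (camera frame).

    Assumes a pinhole camera with focal lengths (fx, fy) and
    principal point (cx, cy); zero-depth pixels are treated as invalid.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return pts[pts[:, 2] > 0]

def view_synthesis_reward(depth, pose_delta, fx, fy, cx, cy, shape):
    """Fraction of pixels in the synthesized destination view that no
    observed point projects onto -- a proxy for new information.

    pose_delta = (R, t): rotation and translation taking points from the
    current camera frame into the candidate destination camera frame.
    """
    pts = backproject(depth, fx, fy, cx, cy)
    R, t = pose_delta
    pts = pts @ R.T + t
    pts = pts[pts[:, 2] > 1e-6]  # keep only points in front of the camera
    u = np.round(fx * pts[:, 0] / pts[:, 2] + cx).astype(int)
    v = np.round(fy * pts[:, 1] / pts[:, 2] + cy).astype(int)
    h, w = shape
    valid = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    covered = np.zeros(shape, dtype=bool)
    covered[v[valid], u[valid]] = True
    return 1.0 - covered.mean()  # share of "missing" (unobserved) pixels
```

With an identity pose the observed points reproject onto themselves, so the reward is 0; a large sideways translation leaves the destination view uncovered and drives the reward toward 1, matching the intuition that views with more missing content are more informative.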

Keywords