Interactive Semantic Map Representation for Skill-Based Visual Object Navigation

Tatiana Zemskova; Aleksei Staroverov; Kirill Muravyev; Dmitry A. Yudin; Aleksandr I. Panov

doi:10.1109/ACCESS.2024.3380450

IEEE Access (Jan 2024)

Affiliations

Tatiana Zemskova: Artificial Intelligence Research Institute (AIRI), Moscow, Russia
Aleksei Staroverov: Artificial Intelligence Research Institute (AIRI), Moscow, Russia
Kirill Muravyev: Federal Research Center “Computer Science and Control,” Moscow, Russia
Dmitry A. Yudin: Artificial Intelligence Research Institute (AIRI), Moscow, Russia
Aleksandr I. Panov: Artificial Intelligence Research Institute (AIRI), Moscow, Russia

DOI: https://doi.org/10.1109/ACCESS.2024.3380450
Journal volume & issue: Vol. 12
pp. 44628 – 44639

Abstract

Visual object navigation is one of the key tasks in mobile robotics. One of the most important components of this task is the accurate semantic representation of the scene, which is needed to determine and reach a goal object. This paper introduces a new representation of a scene semantic map formed during the embodied agent interaction with the indoor environment. It is based on a neural network method that adjusts the weights of the segmentation model with backpropagation of the predicted fusion loss values during inference on a regular (backward) or delayed (forward) image sequence. We implement this representation into a full-fledged navigation approach called SkillTron. The method can select robot skills from end-to-end policies based on reinforcement learning and classic map-based planning methods. The proposed approach makes it possible to form both intermediate goals for robot exploration and the final goal for object navigation. We conduct intensive experiments with the proposed approach in the Habitat environment, demonstrating its significant superiority over state-of-the-art approaches in terms of navigation quality metrics. The developed code and custom datasets are publicly available at github.com/AIRI-Institute/skill-fusion.

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
