Jisuanji kexue (Jan 2023)

Viewpoint-tolerant Scene Recognition Based on Segmentation of Sparse Point Cloud

  • HE Xionghui, TAN Jiefu, LIU Zhe, XUE Chao, YANG Shaowu, ZHANG Yongjun

DOI
https://doi.org/10.11896/jsjkx.211000118
Journal volume & issue
Vol. 50, no. 1
pp. 87 – 97

Abstract

In autonomous robot navigation, simultaneous localization and mapping (SLAM) is responsible for perceiving the surrounding environment and localizing the robot within it, providing perceptual support for subsequent high-level tasks. Scene recognition, as a key module, helps the robot perceive its surroundings more accurately: by identifying whether the current observation and a previous observation belong to the same scene, it can correct the accumulated error caused by sensor error. Existing methods mainly focus on scene recognition under a stable viewpoint and judge whether two observations belong to the same scene based on their visual similarity. However, when the viewing angle changes, observations of the same scene may differ greatly in appearance and be only partially similar, which causes traditional methods to fail. Therefore, a scene recognition method based on sparse point cloud segmentation is proposed. It segments the scene to handle partial similarity and combines visual and geometric information to achieve accurate scene description and matching, so that the robot can recognize observations of the same scene from different viewpoints. This supports loop detection for a single robot and map fusion for multiple robots. The method divides each observation into several segments based on sparse point cloud segmentation. The segmentation result is invariant to viewpoint, and a local bag-of-words vector and a β angle histogram are extracted from each segment to describe its scene content accurately. The former captures the visual semantic information of the scene; the latter captures its geometric structure. Then, at the segment level, the parts shared between observations are matched and the differing parts are discarded, achieving accurate scene content matching and improving the success rate of place recognition. Finally, results on a public dataset show that this method outperforms the mainstream bag-of-words method under both stable and changing viewpoints.
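
The segment-level matching described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the β angle histogram is computed or how the two descriptors are combined, so the sketch assumes each segment already carries a bag-of-words vector (`bow`) and a geometric histogram (`hist`), scores a pair of segments by an equally weighted sum of cosine similarity and histogram intersection, and matches greedily above a hypothetical threshold. All function names, the weight, and the threshold are illustrative assumptions.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity of two bag-of-words vectors (0.0 if either is all zeros)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def histogram_intersection(h1, h2):
    """Intersection of two histograms after normalizing each to sum to 1."""
    h1 = np.asarray(h1, dtype=float)
    h2 = np.asarray(h2, dtype=float)
    if h1.sum() == 0 or h2.sum() == 0:
        return 0.0
    return float(np.minimum(h1 / h1.sum(), h2 / h2.sum()).sum())


def segment_similarity(seg_a, seg_b, w_visual=0.5):
    """Assumed combination: weighted sum of visual and geometric similarity."""
    visual = cosine_similarity(seg_a["bow"], seg_b["bow"])
    geometric = histogram_intersection(seg_a["hist"], seg_b["hist"])
    return w_visual * visual + (1.0 - w_visual) * geometric


def match_segments(obs_a, obs_b, threshold=0.8):
    """Greedily pair segments of two observations; unmatched segments are discarded.

    Returns a list of (index_in_a, index_in_b, score) tuples.
    """
    matches = []
    used = set()
    for i, seg_a in enumerate(obs_a):
        best_j, best_score = None, threshold
        for j, seg_b in enumerate(obs_b):
            if j in used:
                continue
            score = segment_similarity(seg_a, seg_b)
            if score >= best_score:
                best_j, best_score = j, score
        if best_j is not None:
            used.add(best_j)
            matches.append((i, best_j, best_score))
    return matches


# Two observations that share only one segment (the partial-similarity case).
shared = {"bow": [0, 3, 0], "hist": [0, 0, 4]}
only_in_a = {"bow": [1, 0, 2], "hist": [1, 1, 0]}
only_in_b = {"bow": [5, 5, 5], "hist": [0, 2, 2]}

matches = match_segments([only_in_a, shared], [dict(shared), only_in_b])
print(matches)
```

In this toy case only the shared segment is paired, while the segments unique to each observation fall below the threshold and are ignored. This mirrors the idea of matching the common parts and discarding the differing ones.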

Keywords