Indoor 3D Semantic Robot VSLAM Based on Mask Regional Convolutional Neural Network

Chongben Tao; Zhen Gao; Jinli Yan; Chunguang Li; Guozeng Cui

doi:10.1109/ACCESS.2020.2981648

IEEE Access (Jan 2020)

Affiliations

Chongben Tao: Suzhou Smart City Research Institute, Suzhou University of Science and Technology, Suzhou, China
Zhen Gao: Faculty of Engineering, McMaster University, Hamilton, ON, Canada
Jinli Yan: Suzhou Smart City Research Institute, Suzhou University of Science and Technology, Suzhou, China
Chunguang Li: School of Computer Information and Engineering, Changzhou Institute of Technology, Changzhou, China
Guozeng Cui: Suzhou Smart City Research Institute, Suzhou University of Science and Technology, Suzhou, China

DOI: https://doi.org/10.1109/ACCESS.2020.2981648
Journal volume & issue: Vol. 8
pp. 52906 – 52916

Abstract

During the construction of indoor semantic maps by robot visual SLAM (VSLAM), problems arise such as low label classification accuracy and low precision when feature points are sparse. To address this, this paper proposes an indoor three-dimensional semantic VSLAM algorithm based on the Mask Regional Convolutional Neural Network (Mask RCNN). First, the Oriented FAST and Rotated BRIEF (ORB) algorithm is used to extract image feature points. Second, the Random Sample Consensus (RANSAC) algorithm is employed to eliminate mismatched points and estimate changes in camera pose. Then, Mask RCNN is applied with partial adjustments to its hyperparameters, and a self-made data set is used for transfer learning, achieving real-time target detection and instance segmentation of a scene. A three-dimensional semantic map is then constructed in combination with the VSLAM algorithm. The semantic information in the environment not only improves the accuracy of VSLAM mapping and positioning, but also reduces the impact of object movement on map construction by marking movable objects. Meanwhile, the VSLAM algorithm is used to calculate the positional constraints between objects, which improves the accuracy of semantic understanding. Finally, comparison with other methods demonstrates that the proposed method is more accurate and effective. It is also verified that the proposed method can accurately interpret the semantic information in the environment for the construction of three-dimensional semantic maps.
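
The RANSAC step mentioned in the abstract (rejecting mismatched points while estimating motion) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a simplified 2D rigid motion (rotation plus translation) in place of the full camera pose, and synthetic point matches in place of real ORB matches. The function names `fit_rigid_2d`, `residual`, `ransac_rigid_2d`, and `make_matches`, and the parameters `iters`, `tol`, and `seed`, are hypothetical and chosen only for illustration.

```python
import math
import random


def fit_rigid_2d(src, dst):
    """Least-squares 2D rotation + translation mapping src points onto dst."""
    n = len(src)
    cx_s = sum(p[0] for p in src) / n
    cy_s = sum(p[1] for p in src) / n
    cx_d = sum(p[0] for p in dst) / n
    cy_d = sum(p[1] for p in dst) / n
    # Accumulate cosine-like and sine-like sums over the centred points.
    s_cos = s_sin = 0.0
    for (xs, ys), (xd, yd) in zip(src, dst):
        xs, ys, xd, yd = xs - cx_s, ys - cy_s, xd - cx_d, yd - cy_d
        s_cos += xs * xd + ys * yd
        s_sin += xs * yd - ys * xd
    theta = math.atan2(s_sin, s_cos)
    c, s = math.cos(theta), math.sin(theta)
    tx = cx_d - (c * cx_s - s * cy_s)
    ty = cy_d - (s * cx_s + c * cy_s)
    return theta, tx, ty


def residual(model, p, q):
    """Distance between the transformed point p and its claimed match q."""
    theta, tx, ty = model
    c, s = math.cos(theta), math.sin(theta)
    px = c * p[0] - s * p[1] + tx
    py = s * p[0] + c * p[1] + ty
    return math.hypot(px - q[0], py - q[1])


def ransac_rigid_2d(src, dst, iters=200, tol=0.05, seed=0):
    """Return (model, inlier_indices) for the largest consensus set found."""
    rng = random.Random(seed)
    idx = range(len(src))
    best = []
    for _ in range(iters):
        # Two matches are the minimal sample for a 2D rigid motion.
        i, j = rng.sample(idx, 2)
        model = fit_rigid_2d([src[i], src[j]], [dst[i], dst[j]])
        inliers = [k for k in idx if residual(model, src[k], dst[k]) < tol]
        if len(inliers) > len(best):
            best = inliers
    if len(best) < 2:
        raise ValueError("no consensus set found")
    # Refit on all inliers of the best hypothesis.
    refined = fit_rigid_2d([src[k] for k in best], [dst[k] for k in best])
    return refined, best


def make_matches(theta, tx, ty, n_good=30, n_bad=10, seed=1):
    """Synthetic matches: n_good correct ones followed by n_bad mismatches."""
    rng = random.Random(seed)
    c, s = math.cos(theta), math.sin(theta)
    src, dst = [], []
    for k in range(n_good + n_bad):
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
        src.append((x, y))
        if k < n_good:
            dst.append((c * x - s * y + tx, s * x + c * y + ty))
        else:
            dst.append((rng.uniform(-3, 3), rng.uniform(-3, 3)))
    return src, dst


if __name__ == "__main__":
    src, dst = make_matches(theta=0.3, tx=0.5, ty=-0.2)
    model, inliers = ransac_rigid_2d(src, dst)
    print("estimated (theta, tx, ty):", model)
    print("matches kept:", len(inliers), "of", len(src))
```

In a real VSLAM front end, the matches would come from ORB descriptors and the model would be a 3D camera pose. The loop structure stays the same: sample a minimal set, fit a model, count inliers, and refit on the best consensus set.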

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
