Semantic and Context Information Fusion Network for View-Based 3D Model Classification and Retrieval

An-An Liu; Fu-Bin Guo; He-Yu Zhou; Wen-Hui Li; Dan Song

doi:10.1109/ACCESS.2020.3018875

IEEE Access (Jan 2020)

Affiliations

An-An Liu: School of Electrical and Information Engineering, Tianjin University, Tianjin, China
Fu-Bin Guo: School of Electrical and Information Engineering, Tianjin University, Tianjin, China
He-Yu Zhou: School of Electrical and Information Engineering, Tianjin University, Tianjin, China
Wen-Hui Li: School of Electrical and Information Engineering, Tianjin University, Tianjin, China
Dan Song: School of Electrical and Information Engineering, Tianjin University, Tianjin, China

DOI: https://doi.org/10.1109/ACCESS.2020.3018875
Journal volume & issue: Vol. 8
pp. 155939 – 155950

Abstract

In recent years, with the rapid development of 3D technology, view-based methods have shown excellent performance in both 3D model classification and retrieval. A key issue in view-based methods is how to aggregate multi-view features. Existing methods commonly use one of two solutions: 1) a pooling strategy that merges multi-view features but ignores the context information contained in the continuous view sequence; 2) a grouping strategy or long short-term memory (LSTM) networks that select representative views of the 3D model but easily neglect the semantic information of individual views. In this paper, we propose a novel Semantic and Context information Fusion Network (SCFN) to compensate for these drawbacks. First, we render views of the 3D model from multiple perspectives and extract the raw feature of each view with a 2D convolutional neural network (CNN). Then we design a channel attention mechanism (CAM) to exploit the view-wise semantic information: by modeling the correlation among view feature channels, we assign higher weights to useful feature attributes while suppressing useless ones. Next, we propose a context information fusion module (CFM) that fuses the multiple view features into a compact 3D representation. Extensive experiments on three popular datasets, i.e., ModelNet10, ModelNet40, and ShapeNetCore55, demonstrate the superiority of the proposed method over state-of-the-art methods on both 3D classification and retrieval tasks.
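
The channel reweighting idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the CAM is built, so the code below assumes a squeeze-and-excitation style gate (a small bottleneck followed by a sigmoid) as a stand-in; the function names, the weights `w1` and `w2`, and all dimensions are hypothetical and randomly initialized rather than learned. The final step uses plain max pooling across views, which is the baseline aggregation the abstract criticizes for ignoring view order, not the proposed CFM.

```python
import numpy as np


def channel_attention(view_features, w1, w2):
    """Reweight the feature channels of each view with a sigmoid gate.

    view_features: array of shape (num_views, channels), one row per view.
    w1: (channels, hidden) and w2: (hidden, channels) are hypothetical
    gate weights; in a trained network they would be learned.
    """
    hidden = np.maximum(view_features @ w1, 0.0)   # ReLU bottleneck
    gates = 1.0 / (1.0 + np.exp(-(hidden @ w2)))   # sigmoid, values in (0, 1)
    return view_features * gates, gates


def max_pool_views(view_features):
    """Baseline aggregation: element-wise maximum over the view axis."""
    return view_features.max(axis=0)


# Stand-in for per-view CNN features (assumed sizes, random values).
rng = np.random.default_rng(0)
num_views, channels, hidden_dim = 12, 8, 4
features = rng.standard_normal((num_views, channels))
w1 = rng.standard_normal((channels, hidden_dim)) * 0.1
w2 = rng.standard_normal((hidden_dim, channels)) * 0.1

reweighted, gates = channel_attention(features, w1, w2)
descriptor = max_pool_views(reweighted)

# Max pooling gives the same descriptor for any view order, which is why
# it cannot capture context from the view sequence.
reversed_descriptor = max_pool_views(reweighted[::-1])
```

Each gate value scales one channel of one view, so channels with gates near 1 pass through largely unchanged while channels with gates near 0 are suppressed. Comparing `descriptor` with `reversed_descriptor` shows that the pooled result is independent of view order.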

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
