IEEE Access (Jan 2020)

Multi-Attention Network for Stereo Matching

  • Xiaowei Yang,
  • Lin He,
  • Yong Zhao,
  • Haiwei Sang,
  • Zu Liu Yang,
  • Xian Jing Cheng

DOI
https://doi.org/10.1109/ACCESS.2020.3003375
Journal volume & issue
Vol. 8
pp. 113371 – 113382

Abstract

In recent years, convolutional neural network (CNN) based algorithms have substantially advanced stereo matching, but mismatches still occur in textureless, occluded, and reflective regions. During feature extraction and cost aggregation, the accuracy of stereo matching can be greatly improved by exploiting global context information and high-quality feature representations. In this paper, we design a novel end-to-end stereo matching algorithm named Multi-Attention Network (MAN). To capture global context information at the pixel level during feature extraction, we propose a Multi-Scale Attention Module (MSAM) that combines a spatial pyramid module with an attention mechanism. In addition, we introduce a Feature Refinement Module (FRM) and a 3D Attention Aggregation Module (3D AAM) during cost aggregation, so that the network can extract informative features with high representational ability and high-quality channel attention vectors. Finally, we obtain the final disparity map through bilinear interpolation and disparity regression. We evaluate our method on the Scene Flow, KITTI 2012, and KITTI 2015 stereo datasets. The experimental results show that our method achieves state-of-the-art performance and that every component of our network is effective.
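The disparity regression mentioned at the end of the pipeline is, in most end-to-end stereo networks of this family, a differentiable soft argmin over the aggregated cost volume: a softmax over the disparity dimension produces a probability per candidate disparity, and the expected disparity is taken as the output. The sketch below is a minimal NumPy illustration of that standard operation; the function name and the (D, H, W) cost-volume layout are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def soft_argmin_disparity(cost_volume):
    """Differentiable disparity regression (soft argmin).

    cost_volume: array of shape (D, H, W) holding matching costs for
    each candidate disparity d = 0..D-1 at every pixel; lower cost
    means a better match. Returns an (H, W) sub-pixel disparity map.
    """
    num_disp = cost_volume.shape[0]
    # Softmax over the disparity axis of the negated costs, so that
    # low-cost disparities receive high probability.
    logits = -cost_volume
    logits = logits - logits.max(axis=0, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=0, keepdims=True)
    # Expected disparity: sum_d d * P(d) per pixel (sub-pixel estimate).
    disparities = np.arange(num_disp, dtype=np.float64).reshape(num_disp, 1, 1)
    return (probs * disparities).sum(axis=0)
```

Because the expectation is taken over all disparity hypotheses, the result is sub-pixel and fully differentiable, which is what allows the whole network to be trained end-to-end against ground-truth disparities.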

Keywords