A Robust End-to-End Speckle Stereo Matching Network for Industrial Scenes

Yunxuan Liu; Kai Yang; Xinyu Li; Zijian Bai; Yingying Wan; Liming Xie

doi:10.1109/ACCESS.2024.3352136

IEEE Access (Jan 2024)

Affiliations

Yunxuan Liu: School of Physical Science and Technology, Southwest Jiaotong University, Chengdu, China
Kai Yang: School of Physical Science and Technology, Southwest Jiaotong University, Chengdu, China
Xinyu Li: School of Physical Science and Technology, Southwest Jiaotong University, Chengdu, China
Zijian Bai: School of Physical Science and Technology, Southwest Jiaotong University, Chengdu, China
Yingying Wan: School of Physical Science and Technology, Southwest Jiaotong University, Chengdu, China
Liming Xie: School of Physical Science and Technology, Southwest Jiaotong University, Chengdu, China

DOI: https://doi.org/10.1109/ACCESS.2024.3352136
Journal volume & issue: Vol. 12, pp. 6777–6789

Abstract

The detection capability of deep learning-based stereo matching in industrial applications is limited by weak texture and inconsistent reflectance, which make it difficult to accurately recover complex surface details. To achieve accurate measurements, this paper presents an end-to-end speckle stereo matching network that incorporates fringe, Gray code, and speckle projection patterns. The model is trained on a high-precision dataset consisting of thousands of pairs generated through binocular Gray code-assisted phase shifting. After local correspondences between the left and right images are established using speckle patterns, the images are used as inputs to the network. The proposed network consists of two siamese 2D feature extraction networks: one is dedicated to cost volume computation, while the other focuses on weight refinement feature extraction. The former incorporates a lightweight module for extracting high-dimensional fusion features, which are obtained from different dilation scales and randomly concatenated along the channel dimension. Patch convolution is used to adapt to pixel features at various levels, reducing redundancy within the cost volume and improving the network's capacity to learn from ill-posed regions. Experimental results demonstrate that the proposed network improves matching accuracy by approximately 10.7% compared to state-of-the-art networks on public datasets. Furthermore, the method exhibits outstanding matching results when applied to diverse industrial scenarios. The reconstruction error for the radius of optical standard spheres is below 0.06 mm, which meets the demands of the majority of industrial applications.
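The abstract mentions Gray code-assisted phase shifting as the source of the training data but does not describe the procedure. As a rough illustration only, the sketch below shows the textbook form of that technique for a single pixel: an N-step phase-shifting formula recovers a wrapped phase, and a decoded Gray code supplies the fringe order used to unwrap it. The function names, the intensity model I_k = A + B cos(phi + 2*pi*k/N), and the sample values are assumptions made for this example; they are not taken from the paper and may differ from the authors' pipeline.

```python
import math

def wrapped_phase(intensities):
    """Wrapped phase in [0, 2*pi) from N >= 3 equally shifted fringe samples.

    Assumes the model I_k = A + B*cos(phi + 2*pi*k/N) (an assumption of
    this sketch, not a detail given in the abstract).
    """
    n = len(intensities)
    s = sum(i_k * math.sin(2 * math.pi * k / n) for k, i_k in enumerate(intensities))
    c = sum(i_k * math.cos(2 * math.pi * k / n) for k, i_k in enumerate(intensities))
    # Under the model above, s = -(N/2)*B*sin(phi) and c = (N/2)*B*cos(phi).
    return math.atan2(-s, c) % (2 * math.pi)

def gray_to_index(bits):
    """Decode Gray code bits (most significant first) to an integer fringe order."""
    value = 0
    acc = 0
    for b in bits:
        acc ^= b                      # running XOR converts Gray to binary
        value = (value << 1) | acc
    return value

def absolute_phase(intensities, gray_bits):
    """Unwrap the phase using the Gray-code fringe order."""
    return wrapped_phase(intensities) + 2 * math.pi * gray_to_index(gray_bits)

# Hypothetical pixel: true wrapped phase 1.0 rad, 4-step shifting, fringe order 4.
phi_true, a, b, n = 1.0, 100.0, 50.0, 4
samples = [a + b * math.cos(phi_true + 2 * math.pi * k / n) for k in range(n)]
phase = absolute_phase(samples, [1, 1, 0])   # Gray code 110 decodes to order 4
```

In a binocular setup, pixels on the same rectified row of the left and right images that share an absolute phase can be treated as correspondences, and their column difference gives a disparity label. Whether the authors build their dataset in exactly this way is not stated in the abstract.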

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
