DSC-Net: Enhancing Blind Road Semantic Segmentation with Visual Sensor Using a Dual-Branch Swin-CNN Architecture

Ying Yuan; Yu Du; Yan Ma; Hejun Lv

doi:10.3390/s24186075

Sensors (Sep 2024)

Affiliations

Ying Yuan: Beijing Key Laboratory of Information Service Engineering, College of Robotics, Beijing Union University, Beijing 100101, China
Yu Du: Beijing Key Laboratory of Information Service Engineering, College of Robotics, Beijing Union University, Beijing 100101, China
Yan Ma: Beijing Key Laboratory of Information Service Engineering, College of Robotics, Beijing Union University, Beijing 100101, China
Hejun Lv: Beijing Key Laboratory of Information Service Engineering, College of Robotics, Beijing Union University, Beijing 100101, China

DOI: https://doi.org/10.3390/s24186075
Journal volume & issue: Vol. 24, no. 18
p. 6075

Abstract

In modern urban environments, visual sensors are crucial for enhancing the functionality of navigation systems, particularly for devices designed for visually impaired individuals. The high-resolution images captured by these sensors form the basis for understanding the surrounding environment and identifying key landmarks. However, the core challenge in the semantic segmentation of blind roads lies in the effective extraction of global context and edge features. Most existing methods rely on Convolutional Neural Networks (CNNs), whose inherent inductive biases limit their ability to capture global context and to accurately detect discontinuous features such as gaps and obstructions in blind roads. To overcome these limitations, we introduce the Dual-Branch Swin-CNN Net (DSC-Net), a new method that integrates the global modeling capabilities of the Swin-Transformer with the CNN-based U-Net architecture. This combination allows for the hierarchical extraction of both fine and coarse features. The Spatial Blending Module (SBM) mitigates the blurring of target information caused by object occlusion, which improves accuracy. The Hybrid Attention Module (HAM), embedded within the Inverted Residual Module (IRM), sharpens the detection of blind road boundaries, while the IRM improves the speed of network processing. In tests on a specialized dataset designed for blind road semantic segmentation in real-world scenarios, our method achieved an mIoU of 97.72%. It also performed strongly on other public datasets.
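For readers unfamiliar with the reported metric, the following is a minimal sketch of how mean intersection-over-union (mIoU) is commonly computed for a segmentation result. It is not the authors' evaluation code: the function name `mean_iou`, the flattened label lists, and the two-class labelling (0 = background, 1 = blind road) are assumptions made for illustration only.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Average per-class IoU over the classes that occur in pred or target."""
    if len(pred) != len(target):
        raise ValueError("pred and target must contain the same number of pixels")

    intersection = [0] * num_classes  # pixels where pred and target agree on class c
    union = [0] * num_classes         # pixels labelled c in pred or in target

    for p, t in zip(pred, target):
        if p == t:
            intersection[p] += 1
            union[p] += 1
        else:
            # A mismatched pixel counts toward the union of both classes involved.
            union[p] += 1
            union[t] += 1

    # Skip classes that appear in neither map to avoid dividing by zero.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    if not ious:
        raise ValueError("no class is present in pred or target")
    return sum(ious) / len(ious)


# Hypothetical flattened 2x3 label maps: 0 = background, 1 = blind road.
pred = [0, 1, 1, 0, 1, 0]
target = [0, 1, 0, 0, 1, 1]
print(mean_iou(pred, target, num_classes=2))
```

In this toy case each class has 2 agreeing pixels out of a union of 4, so both per-class IoU values, and their mean, are 0.5. Conventions differ between benchmarks (for example, whether absent classes are skipped or whether counts are accumulated over the whole dataset before dividing), so this sketch may not match the exact protocol behind the 97.72% figure.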

Published in Sensors

ISSN: 1424-8220 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Chemical technology
Website: http://www.mdpi.com/journal/sensors
