Semantic Segmentation Network Based on Adaptive Attention and Deep Fusion Utilizing a Multi-Scale Dilated Convolutional Pyramid

Shan Zhao; Zihao Wang; Zhanqiang Huo; Fukai Zhang

doi:10.3390/s24165305

Sensors (Aug 2024)

Semantic Segmentation Network Based on Adaptive Attention and Deep Fusion Utilizing a Multi-Scale Dilated Convolutional Pyramid

Shan Zhao,
Zihao Wang,
Zhanqiang Huo,
Fukai Zhang

Affiliations

Shan Zhao: School of Software, Henan Polytechnic University, Jiaozuo 454000, China
Zihao Wang: School of Software, Henan Polytechnic University, Jiaozuo 454000, China
Zhanqiang Huo: School of Software, Henan Polytechnic University, Jiaozuo 454000, China
Fukai Zhang: School of Software, Henan Polytechnic University, Jiaozuo 454000, China

DOI: https://doi.org/10.3390/s24165305
Journal volume & issue: Vol. 24, no. 16
p. 5305

Abstract

Read online

Deep learning has recently made significant progress in semantic segmentation. However, the current methods face critical challenges. The segmentation process often lacks sufficient contextual information and attention mechanisms, low-level features lack semantic richness, and high-level features suffer from poor resolution. These limitations reduce the model’s ability to accurately understand and process scene details, particularly in complex scenarios, leading to segmentation outputs that may have inaccuracies in boundary delineation, misclassification of regions, and poor handling of small or overlapping objects. To address these challenges, this paper proposes a Semantic Segmentation Network Based on Adaptive Attention and Deep Fusion with the Multi-Scale Dilated Convolutional Pyramid (SDAMNet). Specifically, the Dilated Convolutional Atrous Spatial Pyramid Pooling (DCASPP) module is developed to enhance contextual information in semantic segmentation. Additionally, a Semantic Channel Space Details Module (SCSDM) is devised to improve the extraction of significant features through multi-scale feature fusion and adaptive feature selection, enhancing the model’s perceptual capability for key regions and optimizing semantic understanding and segmentation performance. Furthermore, a Semantic Features Fusion Module (SFFM) is constructed to address the semantic deficiency in low-level features and the low resolution in high-level features. The effectiveness of SDAMNet is demonstrated on two datasets, revealing significant improvements in Mean Intersection over Union (MIOU) by 2.89% and 2.13%, respectively, compared to the Deeplabv3+ network.

Published in Sensors

ISSN: 1424-8220 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Chemical technology
Website: http://www.mdpi.com/journal/sensors

About the journal

Abstract

Keywords