International Journal of Applied Earth Observations and Geoinformation (Feb 2025)
An enhanced network for extracting tunnel lining defects using transformer encoder and aggregate decoder
Abstract
Tunnel environments are characterized by insufficient ambient light, obstructed views, and complex inner-lining construction conditions. For defect extraction, these factors frequently lead to limited anti-interference capability, reduced recognition accuracy, and suboptimal segmentation results. We propose a deep network with an encoder–decoder framework that integrates Transformer and convolution for comprehensive defect extraction. The encoder combines a hierarchical Transformer backbone with an efficient attention mechanism to capture information at multiple scales. In the decoder, multi-scale information is first aggregated by a Multi-Layer Perceptron (MLP) module. In addition, a Stacking Filters with Atrous Convolutions (SFAC) module enhances perception of the complete defect scope, and a Boundary-aware Attention Module (BAM) enhances edge information to improve defect detection. With this decoder, the multi-scale information from the encoder can be fully aggregated and exploited for complete defect detection. Experiments on a tunnel lining defect image dataset demonstrate the effectiveness of the proposed approach: the network achieves an accuracy (Acc) of 94.4% and a mean intersection over union (mIoU) of 78.14%. Compared with state-of-the-art segmentation networks, our model improves the accuracy of tunnel lining defect extraction and shows enhanced extraction effectiveness and anti-interference capability, meeting the engineering requirements for defect detection in complex tunnel environments.
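For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how accuracy and mIoU are commonly computed for semantic segmentation from a confusion matrix. It is not the authors' evaluation code. It assumes that Acc denotes overall pixel accuracy (the abstract does not say whether overall or per-class accuracy is meant), and the function names, the three class labels, and the toy label sequences are all hypothetical.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels: rows index the ground-truth class, columns the prediction."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def pixel_accuracy(cm: List[List[int]]) -> float:
    """Fraction of all pixels whose predicted class equals the ground truth."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total if total else 0.0


def mean_iou(cm: List[List[int]]) -> float:
    """Average over classes of TP / (TP + FP + FN)."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted c, true class otherwise
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both labels and predictions
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with flattened per-pixel labels; the classes are hypothetical
# (0 = background, 1 and 2 = two defect types).
y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
acc = pixel_accuracy(cm)
miou = mean_iou(cm)
print(f"Acc = {acc:.4f}, mIoU = {miou:.4f}")
```

Because mIoU averages per-class overlap, it penalizes errors on small defect regions more heavily than overall accuracy, which is dominated by the most frequent class. This is one reason a model's mIoU can sit well below its accuracy.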