Remote Sensing (Jul 2024)
Border-Enhanced Triple Attention Mechanism for High-Resolution Remote Sensing Images and Application to Land Cover Classification
Abstract
With the continuous development and spread of remote sensing technology, remote sensing images have become widely used for land cover classification. Because these images have complex spatial structure and texture features, classifying them accurately is a challenging problem. Land cover classification has practical value in fields such as environmental monitoring and protection, urban and rural planning and management, and climate change research. In recent years, deep-learning-based methods for remote sensing image classification have developed rapidly, and semantic segmentation has become one of the mainstream approaches to land cover classification from remote sensing images. Traditional semantic segmentation algorithms tend to ignore edge information, which leads to poor classification along the edges of land cover regions, and numerous attention mechanisms have been proposed to address this problem. This paper proposes BETAM (Border-Enhanced Triple Attention Mechanism), a triple attention mechanism for enhancing edge features in high-resolution remote sensing images. It also introduces DeepBETAM, a new model built on the semantic segmentation network DeeplabV3+. BETAM captures feature dependencies along three dimensions: position, space, and channel. Through feature importance weighting, modeling of spatial relationships, and adaptive learning, BETAM pays more attention to edge features, thereby improving the accuracy of edge detection. A remote sensing image dataset, SMCD (Subject Meticulous Categorization Dataset), is constructed to verify the robustness of BETAM and DeepBETAM. Extensive experiments were conducted on the two self-built datasets, FRSID and SMCD.
Experimental results show that DeepBETAM achieves a mean Intersection over Union (mIoU), mean Pixel Accuracy (mPA), and mean Recall (mRecall) of 63.64%, 71.27%, and 71.31%, respectively. These results surpass those of DeeplabV3+ and of its variants that incorporate other attention mechanisms: DeeplabV3+(SENet), DeeplabV3+(CBAM), DeeplabV3+(SAM), DeeplabV3+(ECANet), and DeeplabV3+(CAM). This is because BETAM yields better edge segmentation and higher segmentation accuracy. In addition, on the self-built dataset, the four main classes (buildings, cultivated land, water bodies, and vegetation) were subdivided and detected with good experimental results, which verified the robustness of BETAM and DeepBETAM. The method has broad application prospects and can support research and applications in land surface classification.
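To make the reported metrics concrete, the following is a minimal sketch of how mIoU and mean recall are commonly computed from a confusion matrix. It is not the authors' evaluation code: the abstract does not give the metric definitions, so the per-class formulas IoU = TP / (TP + FP + FN) and recall = TP / (TP + FN) are assumed here as one widely used convention. The function names `confusion_matrix` and `mean_iou_and_recall` and the toy labels are hypothetical. mPA is left out because its definition varies between codebases.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows index the ground-truth class, columns the predicted class."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_iou_and_recall(cm):
    """Return (mIoU, mean recall), averaging over classes that actually occur.

    Assumed convention (not taken from the paper):
      IoU_c    = TP / (TP + FP + FN)
      recall_c = TP / (TP + FN)
    """
    n = len(cm)
    ious, recalls = [], []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # truth is c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted c, truth is otherwise
        if tp + fp + fn > 0:                        # skip classes absent everywhere
            ious.append(tp / (tp + fp + fn))
        if tp + fn > 0:                             # skip classes absent from the truth
            recalls.append(tp / (tp + fn))
    if not ious or not recalls:
        raise ValueError("confusion matrix contains no labeled pixels")
    return sum(ious) / len(ious), sum(recalls) / len(recalls)


# Toy example: six flattened pixels, three hypothetical classes.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
miou, mrecall = mean_iou_and_recall(cm)
print(f"mIoU = {miou:.4f}, mRecall = {mrecall:.4f}")
```

In practice the confusion matrix would be accumulated over every pixel of every test image before the per-class ratios are averaged, so that large and small images are weighted by pixel count.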
Keywords