IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (Jan 2023)
Capturing Small Objects and Edges Information for Cross-Sensor and Cross-Region Land Cover Semantic Segmentation in Arid Areas
Abstract
Oasis areas adjacent to deserts contain complex land cover with rich detail, objects of interest at multiple scales, and blurred edges, which poses challenges for semantic segmentation of remote sensing images (RSIs). In traditional semantic segmentation methods, detailed spatial information is easily lost in the feature extraction stage, and global context information is not effectively integrated into the segmentation results. To overcome these problems, a land cover semantic segmentation model, the FPN_PSA_DLV3+ network, is proposed in an encoder–decoder manner to capture finer edge and small-object information in RSIs. In the encoder stage, the improved atrous spatial pyramid pooling module extracts multiscale features, especially small-scale feature details; the feature pyramid network (FPN) module better integrates detailed information with semantic information; and a polarized self-attention (PSA) module enhances spatial context information at both the global and local levels. In the decoder stage, the FPN_PSA_DLV3+ network adds a feature fusion branch to concatenate more low-level features. We select Landsat5/7/8 satellite RSIs from northern and southern Xinjiang. Three self-annotated time-series datasets containing more small objects and fine edge information are then constructed by data augmentation. The experimental results show that the proposed method improves the segmentation of small targets and edges, with the F1 score increasing from 81.55% to 83.10% and the mean intersection over union (mIoU) from 72.65% to 74.82% using only the red–green–blue bands. The FPN_PSA_DLV3+ network also generalizes well across regions and across sensors.
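For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how an F1 score and a mean intersection over union can be computed from per-pixel class labels. It assumes macro averaging over classes, which the abstract does not specify, and the function names (`confusion_matrix`, `per_class_scores`, `mean_f1_and_miou`) are hypothetical illustrations, not the authors' evaluation code.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are reference classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def per_class_scores(cm):
    """Return per-class F1 and IoU lists derived from a confusion matrix."""
    n = len(cm)
    f1s, ious = [], []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                         # reference c, predicted other
        fp = sum(cm[r][c] for r in range(n)) - tp    # predicted c, reference other
        f1_den = 2 * tp + fp + fn
        iou_den = tp + fp + fn
        # A class absent from both reference and prediction scores 0 here
        # (an assumption; some toolkits skip such classes instead).
        f1s.append(2 * tp / f1_den if f1_den else 0.0)
        ious.append(tp / iou_den if iou_den else 0.0)
    return f1s, ious


def mean_f1_and_miou(y_true, y_pred, num_classes):
    """Macro-averaged F1 and mean IoU over all classes (assumed averaging)."""
    f1s, ious = per_class_scores(confusion_matrix(y_true, y_pred, num_classes))
    return sum(f1s) / num_classes, sum(ious) / num_classes


# Toy example with flattened label maps for three classes.
reference = [0, 0, 1, 1, 2, 2]
predicted = [0, 1, 1, 1, 2, 0]
mean_f1, miou = mean_f1_and_miou(reference, predicted, num_classes=3)
print(f"mean F1 = {mean_f1:.4f}, mIoU = {miou:.4f}")
```

IoU penalizes each error more heavily than F1 for the same class, which is consistent with the mIoU figures in the abstract being lower than the F1 figures.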
Keywords