IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (Jan 2025)
A Spatiotemporal Fusion Network for Remote Sensing Based on Global Context Attention Mechanism
Abstract
Spatiotemporal fusion algorithms often struggle to balance the extraction of fine spatial detail against the capture of temporal change. To address this, we propose a spatiotemporal fusion network for remote sensing based on a global context (GC) attention mechanism. The network comprises a feature extraction network and a difference network. In the difference network, we introduce a GC attention mechanism that focuses on crucial details across various features and plays a different role in each module. A GC attention-based U-block (GCAU) adopts a U-shaped design and integrates GC attention into every layer, which enables the module to process regions with pronounced spatial heterogeneity and to capture spatial difference information. A GC layer (GCL) block comprises five interconnected GC attention blocks interspersed with local residuals. During training, these residuals help compensate for missing data and strengthen feature transfer, enabling the module to better capture temporal changes in rapidly changing areas. A composite loss function supervises both the final output and the intermediate difference images, improving fusion quality in the temporal, spatial, and visual domains. The model's robustness and superiority are validated on three datasets through subjective and objective evaluations as well as ablation experiments.
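As a rough illustration of the GCL idea described in the abstract, the sketch below chains five global-context attention blocks, each with an additive skip connection. It is not the authors' implementation. The block follows the general global-context pattern (softmax-weighted pooling over positions, a bottleneck transform, broadcast addition), and the names `gc_block`, `gc_layer`, `w_key`, `w_down`, and `w_up` are hypothetical. Treating the additive skip as the "local residual", omitting normalization, and using flattened feature maps are all simplifying assumptions.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def gc_block(features, w_key, w_down, w_up):
    """One global-context attention block (illustrative, assumed design).

    features: C channels, each a list of N flattened pixel values.
    w_key:    C weights producing one attention logit per position.
    w_down:   hidden x C bottleneck weights.
    w_up:     C x hidden expansion weights.
    """
    num_channels = len(features)
    num_positions = len(features[0])
    # Context modelling: one attention weight per position, shared by all channels.
    logits = [
        sum(w_key[c] * features[c][p] for c in range(num_channels))
        for p in range(num_positions)
    ]
    attention = softmax(logits)
    context = [sum(a * x for a, x in zip(attention, ch)) for ch in features]
    # Transform: bottleneck with ReLU (normalization omitted for brevity).
    hidden = [max(0.0, h) for h in matvec(w_down, context)]
    delta = matvec(w_up, hidden)
    # Fusion: add the same per-channel context term to every position.
    # This additive skip stands in for the "local residual" (an assumption).
    return [[x + delta[c] for x in features[c]] for c in range(num_channels)]


def gc_layer(features, params, num_blocks=5):
    """Chain `num_blocks` GC blocks; `params` holds one weight triple per block."""
    out = features
    for w_key, w_down, w_up in params[:num_blocks]:
        out = gc_block(out, w_key, w_down, w_up)
    return out
```

Because every block adds its context term on top of its input, the input features pass through the whole chain unchanged when the learned update is zero, which is the property that lets residual paths preserve information during training.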
Keywords