International Journal of Applied Earth Observations and Geoinformation (Dec 2022)
MUSTFN: A spatiotemporal fusion method for multi-scale and multi-sensor remote sensing images based on a convolutional neural network
Abstract
Spatiotemporal data fusion is a commonly used and well-proven technique for enhancing the application potential of multi-source remote sensing images. However, most existing methods struggle to produce high-quality fusion results when the areas covered by the images undergo rapid land cover changes or when the images have substantial registration errors. While deep learning algorithms have demonstrated their capabilities for imagery fusion, it is challenging to apply deep-learning-based fusion methods in regions that experience persistent cloud cover and have limited cloud-free observations. To address these challenges, we developed a Multi-scene Spatiotemporal Fusion Network (MUSTFN) algorithm based on a Convolutional Neural Network (CNN). Our approach uses multi-level features to fuse images at different resolutions acquired by multiple sensors. Furthermore, MUSTFN uses multi-scale features to overcome the effects of geometric registration errors between images. Additionally, a multi-constrained loss function is proposed to improve the accuracy of imagery fusion over large areas and to solve the fusion and gap-filling problems simultaneously by using cloud-contaminated images with a fine-tuning method. Compared with several commonly used methods, MUSTFN performs better in fusing 30-m Landsat-7 images and 500-m MODIS images over a small area that has undergone large changes: the average relative Mean Absolute Error (rMAE) of the first four bands is 6.8% for MUSTFN, compared with 14.1% for the Enhanced Spatial and Temporal Adaptive Reflectance Fusion Model (ESTARFM), 12.8% for the Flexible Spatiotemporal Data Fusion (FSDAF), 8.4% for the Extended Super-Resolution Convolutional Neural Network (ESRCNN), and 8.1% for the Spatiotemporal Fusion Using a Generative Adversarial Network (STFGAN). 
In particular, for images at different resolutions with different registration accuracies (e.g., 16-m Chinese GaoFen-1 and 500-m MODIS), MUSTFN achieved good-quality fusion results, with an average rMAE of 9.3% in spectral reflectance for the first four bands. Finally, we demonstrated the applicability of MUSTFN (average rMAE of 9.18%) when fusing long-term Landsat-8 composite images and MODIS images over a large region (830 km × 600 km). Overall, our results suggest that MUSTFN is effective in addressing the challenges of imagery fusion, including rapid land cover changes between image acquisition dates, geometric misregistration between images, and the limited availability of cloud-free images. The MUSTFN code is freely available at: https://github.com/qpyeah/MUSTFN.
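The accuracy figures above are band-averaged rMAE values. The abstract does not give the formula, so the following is only a minimal sketch under an assumed definition: the mean absolute error divided by the mean reference reflectance, expressed in percent and evaluated over valid (for example, cloud-free) pixels. The function names `relative_mae` and `mean_band_rmae` and the `valid_mask` argument are hypothetical and are not taken from the MUSTFN code.

```python
import numpy as np


def relative_mae(predicted, reference, valid_mask=None):
    """Assumed rMAE (%): 100 * mean(|pred - ref|) / mean(ref) over valid pixels."""
    predicted = np.asarray(predicted, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)
    if predicted.shape != reference.shape:
        raise ValueError("predicted and reference must have the same shape")

    # Default to using every pixel; otherwise honour the caller's mask
    # (True = usable pixel, e.g. not flagged as cloud).
    if valid_mask is None:
        valid = np.ones(reference.shape, dtype=bool)
    else:
        valid = np.asarray(valid_mask, dtype=bool)
    # Also ignore NaN/inf pixels in either image.
    valid = valid & np.isfinite(predicted) & np.isfinite(reference)
    if not valid.any():
        return float("nan")

    mean_reference = reference[valid].mean()
    if mean_reference == 0:
        return float("nan")
    mean_abs_error = np.abs(predicted[valid] - reference[valid]).mean()
    return float(100.0 * mean_abs_error / mean_reference)


def mean_band_rmae(predicted, reference, valid_mask=None):
    """Average the assumed rMAE over bands; inputs are shaped (bands, rows, cols)."""
    per_band = [
        relative_mae(pred_band, ref_band, valid_mask)
        for pred_band, ref_band in zip(predicted, reference)
    ]
    return float(np.mean(per_band)), per_band


if __name__ == "__main__":
    # Synthetic four-band example, not data from the paper.
    rng = np.random.default_rng(0)
    reference = rng.uniform(0.05, 0.4, size=(4, 32, 32))
    predicted = reference + rng.normal(0.0, 0.01, size=reference.shape)
    average, per_band = mean_band_rmae(predicted, reference)
    print(f"average rMAE over 4 bands: {average:.2f}%")
```

Under this assumed definition, passing a cloud mask as `valid_mask` restricts the evaluation to clear pixels, which is one plausible way to score fusion results when only partly cloud-free reference images exist.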