Remote Sensing (Jan 2021)

Deep Learning Based Thin Cloud Removal Fusing Vegetation Red Edge and Short Wave Infrared Spectral Information for Sentinel-2A Imagery

  • Jun Li,
  • Zhaocong Wu,
  • Zhongwen Hu,
  • Zilong Li,
  • Yisong Wang,
  • Matthieu Molinier

DOI
https://doi.org/10.3390/rs13010157
Journal volume & issue
Vol. 13, no. 1
p. 157

Abstract

Thin clouds seriously affect the availability of optical remote sensing images, especially in visible bands. Short-wave infrared (SWIR) bands are less influenced by thin clouds, but usually have lower spatial resolution than visible (Vis) bands in high spatial resolution remote sensing images (e.g., in Sentinel-2A/B, CBERS04, ZY-1 02D and HJ-1B satellites). Most cloud removal methods do not use the spectral information available in SWIR bands to restore the background information tainted by thin clouds in Vis bands. In this paper, we propose CR-MSS, a novel deep learning-based thin cloud removal method that takes the SWIR and vegetation red edge (VRE) bands as inputs in addition to visible/near infrared (Vis/NIR) bands, in order to improve cloud removal in Sentinel-2 visible bands. Unlike some traditional and deep learning-based cloud removal methods, which use manually designed rescaling algorithms to handle bands at different resolutions, CR-MSS uses convolutional layers to automatically process bands at different resolutions. CR-MSS has two input/output branches that are designed to process Vis/NIR and VRE/SWIR bands, respectively. First, Vis/NIR cloudy bands are down-sampled by a convolutional layer to low spatial resolution features, which are then concatenated with the corresponding features extracted from VRE/SWIR bands. Second, the concatenated features are fed into a fusion tunnel that down-samples and fuses the spectral information from Vis/NIR and VRE/SWIR bands. Third, a decomposition tunnel up-samples and decomposes the fused features. Finally, a transposed convolutional layer up-samples the feature maps to the resolution of the input Vis/NIR bands. CR-MSS was trained on 28 real Sentinel-2A image pairs from around the globe, and tested separately on eight real cloud image pairs and eight simulated cloud image pairs.
The average structural similarity index (SSIM) values of the CR-MSS results on the Vis/NIR bands over all test images were 0.69, 0.71, 0.77, and 0.81, respectively, which is on average 1.74% higher than the best baseline method. The visual results on real Sentinel-2 images demonstrate that CR-MSS can produce more realistic cloud and cloud shadow removal results than baseline methods.
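
The four processing steps described in the abstract can be followed as a change in spatial size from stage to stage. The sketch below is not the authors' implementation. It only traces feature-map sizes through a CR-MSS-like data flow using the standard output-size formulas for strided and transposed convolutions. The kernel sizes, strides, padding, tunnel depth, patch size, and all function and stage names are assumptions made for illustration; the only external fact used is that Sentinel-2 Vis/NIR bands are sampled at 10 m and VRE/SWIR bands at 20 m.

```python
def conv_out(size, kernel, stride, pad):
    """Output length of a strided convolution along one spatial axis."""
    return (size + 2 * pad - kernel) // stride + 1


def tconv_out(size, kernel, stride, pad):
    """Output length of a transposed convolution along one spatial axis."""
    return (size - 1) * stride - 2 * pad + kernel


def trace_shapes(vis_nir_size=256, depth=2):
    """Return (stage, spatial size) pairs for a CR-MSS-like data flow.

    All layer hyperparameters here are assumed, not taken from the paper.
    """
    # A 20 m grid has half as many pixels per side as a 10 m grid of the same area.
    vre_swir_size = vis_nir_size // 2
    stages = [("vis_nir_input", vis_nir_size), ("vre_swir_input", vre_swir_size)]

    # Step 1: a stride-2 convolution brings Vis/NIR onto the VRE/SWIR grid,
    # while a stride-1 convolution extracts VRE/SWIR features at the same size.
    vis_feat = conv_out(vis_nir_size, kernel=4, stride=2, pad=1)
    vre_feat = conv_out(vre_swir_size, kernel=3, stride=1, pad=1)
    if vis_feat != vre_feat:
        raise ValueError("features must share a grid before concatenation")
    size = vis_feat
    stages.append(("concatenated", size))

    # Step 2: the fusion tunnel down-samples the concatenated features.
    for i in range(depth):
        size = conv_out(size, kernel=4, stride=2, pad=1)
        stages.append((f"fusion_{i + 1}", size))

    # Step 3: the decomposition tunnel up-samples back to the VRE/SWIR grid.
    for i in range(depth):
        size = tconv_out(size, kernel=4, stride=2, pad=1)
        stages.append((f"decomposition_{i + 1}", size))

    # Step 4: two output branches; a transposed convolution restores the
    # Vis/NIR resolution, while the VRE/SWIR branch keeps the coarser grid.
    stages.append(("vre_swir_output", conv_out(size, kernel=3, stride=1, pad=1)))
    stages.append(("vis_nir_output", tconv_out(size, kernel=4, stride=2, pad=1)))
    return stages


if __name__ == "__main__":
    for stage, size in trace_shapes():
        print(f"{stage:>16}: {size} x {size}")
```

With these assumed settings, each output branch returns to the size of its corresponding input, so the learned layers take the place of a hand-written resampling step between the 10 m and 20 m grids.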

Keywords