International Journal of Applied Earth Observations and Geoinformation (Nov 2022)

A multi-task driven and reconfigurable network for cloud detection in cloud-snow coexistence regions from very-high-resolution remote sensing images

  • Guangbin Zhang,
  • Xianjun Gao,
  • Jinhui Yang,
  • Yuanwei Yang,
  • Meilin Tan,
  • Jie Xu,
  • Yanjun Wang

Journal volume & issue
Vol. 114
p. 103070

Abstract

Cloud detection is a crucial step in remote sensing image preprocessing. Deep learning (DL) methods are currently preferred for cloud detection, and the original images need to be cropped into smaller patches for DL training and testing. However, in very-high-resolution (VHR) remote sensing scenes where cloud and snow coexist, the cropped patches have low feature richness and high inter-class feature similarity. As a result, common DL methods are prone to misclassifying clouds in complex regions. Therefore, based on an expanded cloud detection dataset of VHR cloud-snow coexistence regions, this paper proposes a multi-task driven and reconfigurable network (MTDR-Net) to improve cloud detection in such scenes. MTDR-Net comprises four modules: the high-resolution backbone with multiple projection heads module (HRB-MPH), the re-parametrizable multi-scale feature fusion module (RP-MFF), the lightweight and adaptive feature fusion module (LAFF), and the multi-task gradient flow guidance module (MTGFG). HRB-MPH facilitates global information interaction, enhances the multi-level granularity feature representation, and supports feature sharing between pixel-level and superpixel-level segmentation tasks. RP-MFF captures local multi-scale cloud features in the training stage and losslessly simplifies its structure in the testing stage. LAFF is employed to reconstruct meaningful cloud features. MTGFG provides unbiased guidance on dividing the gradient flow among the multiple tasks. The experimental results show that MTDR-Net achieves the highest accuracy, owing to its resistance to interference from confusing ground objects and its strong extraction of cloud boundaries and thin clouds. In addition, MTDR-Net has the fewest parameters and forward-inference floating-point operations, and the smallest memory footprint and memory access volume. With the coarse-to-fine strategy, MTDR-Net can be further sped up by avoiding inference resource consumption in cloud-free/all-cloud scenes.
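
The abstract does not specify the branch layout of RP-MFF, so the following is only a generic sketch of the idea that a multi-branch block used in training can be folded into a single convolution for testing without changing its output. It assumes a hypothetical single-channel block with a 3x3 branch, a 1x1 branch, and an identity shortcut, and it omits batch normalization; all function and variable names are illustrative and are not taken from the paper.

```python
import random

def conv3x3_same(img, kernel, bias=0.0):
    """Stride-1, zero-padded 3x3 cross-correlation on a 2-D list of floats."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = bias
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:  # zero padding outside the image
                        acc += kernel[dy + 1][dx + 1] * img[yy][xx]
            out[y][x] = acc
    return out

def multi_branch(img, k3, b3, w1, b1):
    """Training-time form (assumed): 3x3 branch + 1x1 branch + identity shortcut."""
    y3 = conv3x3_same(img, k3, b3)
    return [[y3[y][x] + (w1 * img[y][x] + b1) + img[y][x]
             for x in range(len(img[0]))] for y in range(len(img))]

def merge(k3, b3, w1, b1):
    """Fold the 1x1 and identity branches into the centre tap of one 3x3 kernel."""
    merged = [row[:] for row in k3]
    merged[1][1] += w1 + 1.0  # 1x1 weight plus identity (weight 1)
    return merged, b3 + b1

random.seed(0)
img = [[random.random() for _ in range(6)] for _ in range(5)]
k3 = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
b3, w1, b1 = 0.1, -0.4, 0.05  # arbitrary illustrative parameters

train_out = multi_branch(img, k3, b3, w1, b1)
k_merged, b_merged = merge(k3, b3, w1, b1)
test_out = conv3x3_same(img, k_merged, b_merged)

# The two forms agree up to floating-point rounding.
max_abs_diff = max(abs(a - b)
                   for ra, rb in zip(train_out, test_out)
                   for a, b in zip(ra, rb))
```

Because convolution is linear, the parallel branches sum into one kernel, so the merged form needs a single 3x3 pass at test time. This is the sense in which such a simplification can be lossless.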

Keywords