IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (Jan 2021)
Multisensor Land Cover Classification With Sparsely Annotated Data Based on Convolutional Neural Networks and Self-Distillation
Abstract
Extensive research studies have been conducted in recent years to exploit the complementarity among multisensor (or multimodal) remote sensing data for prominent applications such as land cover mapping. In order to make a step further with respect to previous studies, which investigate multitemporal SAR and optical data or multitemporal/multiscale optical combinations, here, we propose a deep learning framework that simultaneously integrates all these input sources, specifically multitemporal SAR/optical data and fine-scale optical information at their native temporal and spatial resolutions. Our proposal relies on a patch-based multibranch convolutional neural network (CNN) that exploits different per-source encoders to deal with the specificity of the input signals. In addition, we introduce a new self-distillation strategy to boost the per-source analyses and exploit the interplay among the different input sources. This new strategy leverages the final prediction of the multisource framework to guide the learning of the per-source CNN encoders supporting the network to learn from itself. Experiments are carried out on two real-world benchmarks, namely, the Reunion island (a French overseas department) and the Dordogne study site (a southwest department in France), where the annotated reference data were collected under operational constraints (sparsely annotated ground-truth data). Obtained results providing an overall classification accuracy of about 94% (respectively, 88%) on the Reunion island (respectively, the Dordogne) study site highlight the effectiveness of our framework based on CNNs and self-distillation to combine heterogeneous multisensor remote sensing data and confirm the benefit of multimodal analysis for downstream tasks such as land cover mapping.
Keywords