Applied Sciences (Aug 2022)

Exploiting Hierarchical Label Information in an Attention-Embedding, Multi-Task, Multi-Grained, Network for Scene Classification of Remote Sensing Imagery

  • Peng Zeng,
  • Shixuan Lin,
  • Hao Sun,
  • Dongbo Zhou

DOI
https://doi.org/10.3390/app12178705
Journal volume & issue
Vol. 12, no. 17
p. 8705

Abstract

Read online

Remote sensing scene classification aims to automatically assign proper labels to remote sensing images. Most of the existing deep learning based methods usually consider the interclass and intraclass relationships of the image content for classification. However, these methods rarely consider the hierarchical information of scene labels, as a scene label may belong to hierarchically multi-grained levels. For example, multi-grained level labels may indicate that a remote sensing scene image may belong to the coarse-grained label “transportation land” while also belonging to the fine-grained label “airport”. In this paper, to exploit hierarchical label information, we propose an attention-embedding multi-task multi-grained network (AEMMN) for remote sensing scene classification. In the proposed AEMMN, we add a coarse-grained classifier as the first level and a fine-grained classifier as the second level to perform multi-task learning tasks. Additionally, a gradient control module is utilized to control the gradient propagation of two classifiers to suppress the negative transfer caused by the irrelevant features between tasks. In the feature extraction portion, the model uses an ECA module embedding Resnet50 to extract effective features with cross-channel interaction information. Furthermore, an external attention module is exploited to improve the discrimination of fine-grained and coarse-grained features. Experiments were conducted on the NWPU-RESISC45 and the Aerial Image Data Set (AID), and the overall accuracy of the proposed AEMMN is 92.07% on the NWPU-RESISC45 dataset and reached 94.96% on the AID. The results indicate that hierarchical label information can effectively improve the performance of scene classification tasks when categorizing remote sensing imagery.

Keywords