CAAI Transactions on Intelligence Technology (Dec 2023)

Scale‐wise interaction fusion and knowledge distillation network for aerial scene recognition

  • Hailong Ning,
  • Tao Lei,
  • Mengyuan An,
  • Hao Sun,
  • Zhanxuan Hu,
  • Asoke K. Nandi

DOI
https://doi.org/10.1049/cit2.12208
Journal volume & issue
Vol. 8, no. 4
pp. 1178–1190

Abstract

Aerial scene recognition (ASR) has attracted great attention due to its increasingly essential applications. Most ASR methods adopt a multi-scale architecture because both global and local features play important roles in ASR. However, existing multi-scale methods neglect the effective interactions among different scales and spatial locations when fusing global and local features, limiting their ability to handle the large-scale variation and complex backgrounds of aerial scene images. In addition, existing methods may generalise poorly owing to the millions of parameters to be learnt and to inconsistent predictions between global and local features. To tackle these problems, this study proposes a scale-wise interaction fusion and knowledge distillation (SIF-KD) network for learning robust and discriminative features with scale-invariant and background-independent information. The main highlights of this study are twofold. On the one hand, a global-local feature collaborative learning scheme is devised for extracting scale-invariant features so as to tackle the large-scale variation problem in aerial scene images. Specifically, a plug-and-play multi-scale context attention fusion module is proposed for collaboratively fusing the context information between global and local features. On the other hand, a scale-wise knowledge distillation scheme is proposed to produce more consistent predictions by distilling the predictive distributions between different scales during training. Comprehensive experimental results show that the proposed SIF-KD network achieves the best overall accuracy, with 99.68%, 98.74% and 95.47% on the UCM, AID and NWPU-RESISC45 datasets, respectively, compared with state-of-the-art methods.
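
Example: multi-scale context attention fusion

To make the fusion idea concrete, the following is a minimal PyTorch sketch in which local features attend to the global feature map through cross-attention. The module name, the 1x1 convolution projections and the residual connection are illustrative assumptions, not the paper's exact module design.

    import torch
    import torch.nn as nn

    class ContextAttentionFusion(nn.Module):
        """Illustrative cross-scale attention fusion (assumed design):
        local features query the global feature map for context."""

        def __init__(self, channels):
            super().__init__()
            self.query = nn.Conv2d(channels, channels, kernel_size=1)
            self.key = nn.Conv2d(channels, channels, kernel_size=1)
            self.value = nn.Conv2d(channels, channels, kernel_size=1)
            self.scale = channels ** -0.5

        def forward(self, local_feat, global_feat):
            b, c, h, w = local_feat.shape
            q = self.query(local_feat).flatten(2).transpose(1, 2)   # (b, hw_l, c)
            k = self.key(global_feat).flatten(2)                    # (b, c, hw_g)
            v = self.value(global_feat).flatten(2).transpose(1, 2)  # (b, hw_g, c)
            attn = torch.softmax(q @ k * self.scale, dim=-1)        # (b, hw_l, hw_g)
            fused = (attn @ v).transpose(1, 2).reshape(b, c, h, w)
            return local_feat + fused  # residual fusion of the two contexts

Because attention is computed across the two feature maps rather than within one, every local position can draw on global context, which is one way to realise the scale-wise interaction described above.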
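
Example: scale-wise knowledge distillation

The distillation scheme aligns the predictive distributions produced at different scales. A common way to express such a loss is a temperature-softened KL divergence, sketched below in PyTorch; the function name and temperature value are assumptions rather than details taken from the paper.

    import torch.nn.functional as F

    def scale_wise_kd_loss(global_logits, local_logits, temperature=4.0):
        """Assumed KD loss: KL divergence between the softened class
        distributions of the global and local branches.
        Both logits tensors have shape (batch, num_classes)."""
        log_p_global = F.log_softmax(global_logits / temperature, dim=1)
        p_local = F.softmax(local_logits / temperature, dim=1)
        # Scaling by T^2 keeps gradient magnitudes comparable across
        # temperatures (standard practice following Hinton et al., 2015).
        return F.kl_div(log_p_global, p_local,
                        reduction="batchmean") * temperature ** 2

Minimising this term during training pushes the two scales toward consistent predictions, addressing the inconsistency problem raised in the abstract.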

Keywords