IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (Jan 2024)
Decoupled Knowledge Distillation via Spatial Feature Blurring for Hyperspectral Image Classification
Abstract
It is well known that distillation learning can enhance the performance of a light (student) model by transferring knowledge from a heavy (teacher) model, without incurring additional computational or storage costs. This article proposes an improved decoupled knowledge distillation (DKD) strategy for hyperspectral image (HSI) classification. A spatial feature blurring (SFB) module is designed to improve the classification performance of the student network under the DKD strategy. The SFB module uses randomly initialized 2-D standard-normal tensors to blur the spatial features of HSI, thereby increasing data complexity. This aligns with a key property of DKD: it transfers more useful knowledge when training samples are complex. To transfer knowledge effectively, this article also proposes a robust teacher network, the dual-branch spatial transformer-spectral transformer (DBSTST) network. This network describes the spatial and spectral long-range dependencies of HSI, addressing the limitation of convolutional neural networks, whose fixed receptive fields capture only local features. More specifically, the DBSTST network adopts a spatial transformer-spectral transformer built on a parallel spatial-spectral multihead self-attention (PS2MHSA) module, which describes pixel-level spatial long-range dependencies and spectral correlations in HSI. In addition, spatial-spectral positional embedding is introduced into PS2MHSA to enhance positional awareness. We demonstrate the effectiveness of the proposed method on four publicly available HSI datasets. The student network achieves improved classification performance and surpasses several competing networks. Moreover, compared with state-of-the-art classification methods, the DBSTST network also exhibits significant improvements in classification performance.
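The SFB idea described above can be illustrated with a minimal sketch: draw a 2-D kernel from a standard normal distribution, normalize it into a valid blur kernel, and convolve each spectral band of an HSI cube with it. This is a hypothetical reading of the abstract, not the paper's exact module; the function name, kernel size, and normalization choice are assumptions.

```python
import numpy as np

def spatial_feature_blur(cube, kernel_size=3, seed=0):
    """Blur each spectral band of an HSI cube of shape (H, W, B) with a
    randomly initialized 2-D kernel sampled from a standard normal
    distribution. The kernel is softmax-normalized so it sums to 1,
    keeping blurred values in the range of the input.
    Hypothetical sketch of the SFB concept, not the paper's module."""
    rng = np.random.default_rng(seed)
    k = rng.standard_normal((kernel_size, kernel_size))
    k = np.exp(k) / np.exp(k).sum()  # normalize into a valid blur kernel
    pad = kernel_size // 2
    h, w, _ = cube.shape
    padded = np.pad(cube, ((pad, pad), (pad, pad), (0, 0)), mode="reflect")
    out = np.empty_like(cube, dtype=float)
    for i in range(h):
        for j in range(w):
            patch = padded[i:i + kernel_size, j:j + kernel_size, :]
            # Contract the two spatial axes of the kernel against the patch,
            # leaving the spectral axis intact.
            out[i, j, :] = np.tensordot(k, patch, axes=([0, 1], [0, 1]))
    return out

# Example: a tiny 5x5 cube with 4 spectral bands
cube = np.arange(5 * 5 * 4, dtype=float).reshape(5, 5, 4)
blurred = spatial_feature_blur(cube)
print(blurred.shape)  # (5, 5, 4)
```

Because the kernel is random at initialization, each training run perturbs the spatial structure differently, which is one plausible way such a module could raise sample complexity for distillation.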
Keywords