MultiScale spectral–spatial convolutional transformer for hyperspectral image classification

Zhiqiang Gong; Xian Zhou; Wen Yao

doi:10.1049/ipr2.13254

IET Image Processing (Nov 2024)

Affiliations

Zhiqiang Gong: Intelligent Game and Decision Laboratory, Defense Innovation Institute, Chinese Academy of Military Sciences, Beijing, China
Xian Zhou: National Key Laboratory of Science and Technology on ATR, College of Electrical Science and Technology, National University of Defense Technology, Changsha, China
Wen Yao: Intelligent Game and Decision Laboratory, Defense Innovation Institute, Chinese Academy of Military Sciences, Beijing, China

DOI: https://doi.org/10.1049/ipr2.13254
Journal volume & issue: Vol. 18, no. 13
pp. 4328 – 4340

Abstract

Owing to their powerful ability to capture global information, transformers have become an alternative to CNNs for hyperspectral image classification. However, a general transformer mainly considers global spectral information while ignoring the multiscale spatial information of the hyperspectral image. In this paper, we propose a multiscale spectral–spatial convolutional transformer (MultiFormer) for hyperspectral image classification. First, the method uses multiscale spatial patches as tokens to formulate the spatial transformer and generates a multiscale spatial representation of each band at each pixel. Second, the spatial representations of all the bands at a given pixel are used as tokens to formulate the spectral transformer and generate the multiscale spectral–spatial representation of each pixel. In addition, a modified spectral–spatial CAF module is constructed in the MultiFormer to fuse cross-layer spectral and spatial information. The proposed MultiFormer can therefore capture multiscale spectral–spatial information and provide better performance than most other architectures for hyperspectral image classification. Experiments are conducted on commonly used real-world datasets, and the comparison results show the superiority of the proposed method.
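
The two-stage tokenization described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the patch sizes, the embedding width `d`, the random linear projections, the single-head attention, the mean pooling, and all function names are assumptions chosen only to make the data flow concrete, and the convolutional components and the CAF fusion module are omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, wq, wk, wv):
    """Single-head scaled dot-product self-attention over the token axis."""
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

def center_crop(cube, size):
    """Centered size x size spatial window from an (S, S, B) cube."""
    start = (cube.shape[0] - size) // 2
    return cube[start:start + size, start:start + size, :]

def multiscale_tokens(cube, scales, projections):
    """Return tokens of shape (B, len(scales), d): one token per band per scale."""
    tokens = []
    for size, proj in zip(scales, projections):
        crop = center_crop(cube, size)            # (size, size, B)
        flat = crop.reshape(size * size, -1).T    # (B, size*size)
        tokens.append(flat @ proj)                # (B, d)
    return np.stack(tokens, axis=1)

def spectral_spatial_features(cube, scales=(1, 3, 5, 7), d=8, seed=0):
    """Hypothetical pipeline: spatial attention per band, then spectral attention."""
    rng = np.random.default_rng(seed)
    # Random (untrained) weights stand in for learned parameters.
    projections = [rng.normal(size=(s * s, d)) / s for s in scales]
    w = [rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(6)]
    # Stage 1: multiscale spatial patches are the tokens, handled band by band.
    spatial_tokens = multiscale_tokens(cube, scales, projections)    # (B, n_scales, d)
    band_repr = self_attention(spatial_tokens, *w[:3]).mean(axis=1)  # (B, d)
    # Stage 2: the per-band representations are the tokens of the spectral stage.
    pixel_repr = self_attention(band_repr, *w[3:]).mean(axis=0)      # (d,)
    return spatial_tokens, band_repr, pixel_repr

# Example: a 7 x 7 neighbourhood around one pixel with 30 spectral bands.
cube = np.random.default_rng(1).random((7, 7, 30))
tokens, band_repr, pixel_repr = spectral_spatial_features(cube)
print(tokens.shape, band_repr.shape, pixel_repr.shape)
```

In this sketch the neighbourhood side length and every patch size are odd, so each crop is centered on the pixel being classified. A trained model would replace the random weights with learned ones and feed the final pixel representation to a classifier.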

Published in IET Image Processing

ISSN: 1751-9659 (Print); 1751-9667 (Online)
Publisher: Wiley
Country of publisher: United Kingdom
LCC subjects: Technology: Photography; Science: Mathematics: Instruments and machines: Electronic computers. Computer science: Computer software
Website: https://ietresearch.onlinelibrary.wiley.com/journal/17519667
