Semantic Tokenization-Based Mamba for Hyperspectral Image Classification

Ri Ming; Na Chen; Jiangtao Peng; Weiwei Sun; Zhijing Ye

doi:10.1109/JSTARS.2025.3528122

IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (Jan 2025)

Affiliations

Ri Ming: Hubei Key Laboratory of Applied Mathematics, Faculty of Mathematics and Statistics, the Key Laboratory of Intelligent Sensing System and Security, Ministry of Education, Hubei University, Wuhan, China
Na Chen: Hubei Key Laboratory of Applied Mathematics, Faculty of Mathematics and Statistics, the Key Laboratory of Intelligent Sensing System and Security, Ministry of Education, Hubei University, Wuhan, China
Jiangtao Peng: Hubei Key Laboratory of Applied Mathematics, Faculty of Mathematics and Statistics, the Key Laboratory of Intelligent Sensing System and Security, Ministry of Education, Hubei University, Wuhan, China
Weiwei Sun: Department of Geography and Spatial Information Techniques, Ningbo University, Ningbo, China
Zhijing Ye: Faculty of Innovation Engineering, Macau University of Science and Technology, Taipa, Macau

Journal volume & issue: Vol. 18, pp. 4227–4241

Abstract

Transformer-based models have recently shown superior performance in hyperspectral image classification (HSIC) owing to their ability to model long-term dependencies in sequence data. An important component of the transformer is the tokenizer, which transforms features into semantic token sequences (STS). However, because of its global receptive field, the transformer's semantic tokenization strategy poorly represents locally important high-level semantics. More recently, Mamba-based methods have shown even stronger spatial context modeling ability than transformers for HSIC. These methods, however, mainly focus on the spectral and spatial dimensions: they tend either to extract semantic information from very long feature sequences or to represent it with a few typical tokens, which may miss important semantics of hyperspectral images (HSIs). To represent the semantic information of HSIs more holistically in Mamba, this article proposes a semantic tokenization-based Mamba (STMamba) model. In STMamba, a spectral-spatial feature extraction module extracts joint spectral-spatial features. A generated semantic token sequences module then transforms the features into STS. Subsequently, the STS are fed into the semantic token state spatial model to capture relationships between different semantic tokens. Finally, the fused semantic token is passed to a classifier. Experimental results on three HSI datasets demonstrate that the proposed STMamba outperforms existing state-of-the-art deep learning and transformer-based methods.
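
The four stages named in the abstract (feature extraction, semantic tokenization, sequence modeling over tokens, classification) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract gives no equations, so every function name, shape, and parameter below is hypothetical. The tokenizer is assumed to be attention-style pooling over pixels, the sequence model is a fixed diagonal linear state-space recurrence (real Mamba layers use input-dependent, selective parameters), and the fusion step is assumed to be a plain mean over tokens.

```python
import math
import random


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def tokenize(features, score_weights):
    """Pool N pixel features (dim D) into K tokens (assumed tokenizer).

    Each row of score_weights (K x D) scores every pixel; a softmax over
    pixels turns the scores into pooling weights for one token.
    """
    dim = len(features[0])
    tokens = []
    for w in score_weights:
        scores = [sum(wi * fi for wi, fi in zip(w, f)) for f in features]
        attn = softmax(scores)
        tokens.append(
            [sum(a * f[d] for a, f in zip(attn, features)) for d in range(dim)]
        )
    return tokens


def ssm_scan(tokens, decay, b, c):
    """Diagonal linear state-space recurrence along the token sequence.

    h_t = decay * h_{t-1} + b * x_t,  y_t = c * h_t  (element-wise).
    Parameters are fixed here; a selective SSM would compute them from x_t.
    """
    dim = len(tokens[0])
    h = [0.0] * dim
    outputs = []
    for x in tokens:
        h = [decay[d] * h[d] + b[d] * x[d] for d in range(dim)]
        outputs.append([c[d] * h[d] for d in range(dim)])
    return outputs


def classify(tokens, class_weights):
    """Fuse tokens by averaging (assumption) and apply a linear classifier."""
    dim = len(tokens[0])
    fused = [sum(t[d] for t in tokens) / len(tokens) for d in range(dim)]
    logits = [sum(w * x for w, x in zip(row, fused)) for row in class_weights]
    label = max(range(len(logits)), key=logits.__getitem__)
    return label, fused


# Toy run: random numbers stand in for extracted spectral-spatial features.
rng = random.Random(0)
N, D, K, C = 49, 8, 4, 3  # pixels in a patch, feature dim, tokens, classes
features = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(N)]
score_w = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(K)]
class_w = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(C)]

tokens = tokenize(features, score_w)
mixed = ssm_scan(tokens, decay=[0.9] * D, b=[1.0] * D, c=[1.0] * D)
label, fused = classify(mixed, class_w)
print(len(tokens), len(mixed), label)
```

The point of the sketch is the change in sequence length: the recurrence runs over K tokens rather than over all N pixels, which is the contrast the abstract draws with methods that scan very long feature sequences.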

Published in IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing

ISSN: 1939-1404 (Print); 2151-1535 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Ocean engineering; Science: Physics: Geophysics. Cosmic physics
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=4609443
