IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (Jan 2025)
Semantic Tokenization-Based Mamba for Hyperspectral Image Classification
Abstract
Transformer-based models have recently shown superior performance in hyperspectral image classification (HSIC) owing to their ability to model long-range dependencies in sequence data. An important component of the transformer is the tokenizer, which transforms features into semantic token sequences (STS). However, because of its global receptive field, the transformer's semantic tokenization strategy struggles to represent locally important high-level semantics. More recently, Mamba-based methods have shown even stronger spatial context modeling ability than transformers for HSIC. These methods, however, focus mainly on the spectral and spatial dimensions: they tend either to extract semantic information from very long feature sequences or to represent it with only a few typical tokens, and may therefore miss some important semantics of hyperspectral images (HSIs). To represent the semantic information of HSIs more holistically in Mamba, this article proposes a semantic tokenization-based Mamba (STMamba) model. In STMamba, a spectral–spatial feature extraction module first extracts joint spectral–spatial features. A generated semantic token sequences module then transforms these features into STS. The STS are subsequently fed into the semantic token state spatial model to capture relationships between different semantic tokens. Finally, the fused semantic token is passed to a classifier. Experimental results on three HSI datasets demonstrate that the proposed STMamba outperforms existing state-of-the-art deep learning and transformer-based methods.
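As a rough illustration of the data flow described in the abstract (features, then semantic tokens, then a sequential state update over the tokens, then a classifier), the following Python sketch may help. It is not the authors' implementation. The function names, the attention-style pooling used for tokenization, the fixed coefficients `a` and `b`, and the choice of the last state as the fused token are all assumptions made for illustration. In particular, Mamba uses an input-dependent (selective) state space model, whereas this sketch uses a plain linear recurrence with constant coefficients.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def tokenize(features, w_tok):
    """Pool N pixel features (N x C) into L tokens (L x C).

    w_tok is a hypothetical C x L projection. For each token, the pixels
    are scored, the scores are normalized over pixels, and the token is
    the resulting weighted average of the pixel features.
    """
    n_channels = len(features[0])
    n_tokens = len(w_tok[0])
    tokens = []
    for l in range(n_tokens):
        scores = [sum(f[c] * w_tok[c][l] for c in range(n_channels))
                  for f in features]
        attn = softmax(scores)  # weights over pixels, summing to 1
        tokens.append([sum(a * f[c] for a, f in zip(attn, features))
                       for c in range(n_channels)])
    return tokens


def ssm_scan(tokens, a, b):
    """Simplified state update h_t = a * h_{t-1} + b * x_t, per channel.

    Constant a and b are an assumption; a selective SSM would compute
    them from the input token at each step.
    """
    h = [0.0] * len(tokens[0])
    states = []
    for x in tokens:
        h = [a * hi + b * xi for hi, xi in zip(h, x)]
        states.append(h)
    return states


def classify(fused, w_cls):
    """Linear classifier (K x C weights) followed by softmax."""
    logits = [sum(w * v for w, v in zip(row, fused)) for row in w_cls]
    return softmax(logits)


# Toy run on random data: 25 pixels in a patch, 8 channels, 4 tokens, 3 classes.
random.seed(0)
N, C, L, K = 25, 8, 4, 3
features = [[random.gauss(0, 1) for _ in range(C)] for _ in range(N)]
w_tok = [[random.gauss(0, 1) for _ in range(L)] for _ in range(C)]
w_cls = [[random.gauss(0, 1) for _ in range(C)] for _ in range(K)]

tokens = tokenize(features, w_tok)
states = ssm_scan(tokens, a=0.9, b=0.1)
fused = states[-1]  # assumption: use the last state as the fused token
probs = classify(fused, w_cls)
print(len(tokens), len(fused), probs)
```

The point of the sketch is the change in sequence length: the recurrence runs over L = 4 tokens rather than over all 25 pixels, which is the motivation the abstract gives for tokenizing before applying the state model.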
Keywords