European Journal of Remote Sensing (Dec 2024)
A hybrid convolution transformer for hyperspectral image classification
Abstract
Hyperspectral images contain abundant object information and play a crucial role in remote sensing applications such as surveillance, environmental monitoring, and precision agriculture. However, they often face challenges such as limited labelled data and imbalanced classes. In recent years, convolutional neural networks (CNNs) have shown impressive performance in computer vision tasks, including hyperspectral image classification. The emergence of transformers has also attracted attention for hyperspectral image analysis due to their promising capabilities. Nevertheless, transformers typically demand a substantial amount of training data, making their application challenging in scenarios with limited labelled samples. To overcome this limitation, we propose a hybrid convolution transformer framework. Our method combines a vision transformer with a residual 3D convolutional neural network and employs a sequence aggregation layer to mitigate the overfitting that arises when training data are scarce. The proposed residual channel attention module captures richer complementary spatial-spectral information and preserves spectral details during feature extraction. We conducted experiments on three benchmark datasets. The proposed model achieved state-of-the-art overall accuracy (OA) of [Formula: see text], [Formula: see text] and [Formula: see text] using only [Formula: see text], [Formula: see text] and [Formula: see text] labelled training samples, respectively, outperforming other state-of-the-art methods.
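As a rough, non-authoritative illustration of the kind of architecture the abstract describes, the sketch below combines a residual 3D convolution block, a channel attention module, a transformer encoder, and an attention-based sequence aggregation layer in PyTorch. All layer widths, the squeeze-and-excitation-style attention design, the sequence-pooling choice, and the band/patch/class sizes are assumptions made for illustration; this is not the authors' implementation.

```python
# Minimal sketch (not the authors' code) of a hybrid 3D-CNN + transformer
# classifier for hyperspectral patches. Sizes and module designs are assumed.
import torch
import torch.nn as nn


class ResidualChannelAttention3D(nn.Module):
    """Channel re-weighting with a residual path (assumed squeeze-and-excitation style)."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool3d(1)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):  # x: (B, C, bands, H, W)
        w = self.fc(self.pool(x).flatten(1)).view(x.size(0), -1, 1, 1, 1)
        return x + x * w   # residual channel re-weighting


class HybridConvTransformer(nn.Module):
    """Residual 3D-CNN front end + transformer encoder + sequence aggregation."""

    def __init__(self, bands=30, n_classes=16, dim=64, depth=2, heads=4):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv3d(1, 8, kernel_size=3, padding=1), nn.BatchNorm3d(8), nn.ReLU(inplace=True)
        )
        self.res_block = nn.Sequential(
            nn.Conv3d(8, 8, kernel_size=3, padding=1), nn.BatchNorm3d(8), nn.ReLU(inplace=True),
            nn.Conv3d(8, 8, kernel_size=3, padding=1), nn.BatchNorm3d(8),
        )
        self.channel_attn = ResidualChannelAttention3D(8)
        # Each spatial position of the patch becomes one token whose feature
        # vector is the flattened spectral dimension.
        self.proj = nn.Linear(8 * bands, dim)
        enc_layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=depth)
        # Sequence aggregation: attention-weighted pooling over the token
        # sequence instead of a class token (assumed to help with few samples).
        self.attn_pool = nn.Linear(dim, 1)
        self.head = nn.Linear(dim, n_classes)

    def forward(self, x):  # x: (B, 1, bands, patch, patch)
        f = self.stem(x)
        f = torch.relu(f + self.res_block(f))       # residual 3D convolution block
        f = self.channel_attn(f)                    # spatial-spectral re-weighting
        b, c, d, h, w = f.shape
        tokens = f.permute(0, 3, 4, 1, 2).reshape(b, h * w, c * d)
        tokens = self.encoder(self.proj(tokens))
        weights = torch.softmax(self.attn_pool(tokens), dim=1)  # (B, h*w, 1)
        pooled = (weights * tokens).sum(dim=1)      # sequence aggregation
        return self.head(pooled)


if __name__ == "__main__":
    model = HybridConvTransformer(bands=30, n_classes=16)
    dummy = torch.randn(2, 1, 30, 9, 9)  # two 9x9 patches with 30 spectral bands
    print(model(dummy).shape)            # expected: torch.Size([2, 16])
```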
Keywords