GroupFormer for hyperspectral image classification through group attention

Rahim Khan; Tahir Arshad; Xuefei Ma; Haifeng Zhu; Chen Wang; Javed Khan; Zahid Ullah Khan; Sajid Ullah Khan

doi:10.1038/s41598-024-74835-1

Scientific Reports (Oct 2024)

Affiliations

Rahim Khan: College of Information and Communication Engineering, Harbin Engineering University
Tahir Arshad: School of Electronics and Information Engineering, Harbin Institute of Technology
Xuefei Ma: College of Information and Communication Engineering, Harbin Engineering University
Haifeng Zhu: College of Information and Communication Engineering, Harbin Engineering University
Chen Wang: College of Information and Communication Engineering, Harbin Engineering University
Javed Khan: Department of Software Engineering, University of Science and Technology
Zahid Ullah Khan: College of Information and Communication Engineering, Harbin Engineering University
Sajid Ullah Khan: Department of Information Systems, College of Computer Engineering and Sciences, Prince Sattam bin Abdulaziz University

DOI: https://doi.org/10.1038/s41598-024-74835-1
Journal volume & issue: Vol. 14, no. 1
pp. 1 – 19

Abstract

Hyperspectral image (HSI) data carries a wide range of valuable spectral information for numerous tasks. It also poses challenges, including small training sets, data scarcity, and redundant information, and researchers have proposed various methods to address them. Convolutional neural networks (CNNs) have achieved significant success in HSI classification. However, CNNs focus primarily on extracting low-level features from HSI data, and their confined filter size limits their ability to capture long-range dependencies. In contrast, vision transformers have shown great success in HSI classification because their attention mechanisms learn long-range dependencies. The primary issue with these models is that they require sufficient labeled training data. To address this challenge, we propose a spectral-spatial feature extractor group attention transformer, which consists of a multiscale feature extractor for low-level (shallow) features and a group attention mechanism for high-level semantic features. The proposed model is evaluated on four publicly available HSI datasets: Indian Pines, Pavia University, Salinas, and KSC. It achieved the best classification results in terms of overall accuracy (OA), average accuracy (AA), and Kappa coefficient, while using only 5%, 1%, 1%, and 10% of the training samples from the four datasets, respectively.
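
For readers unfamiliar with the three reported metrics, the following is a minimal sketch of how OA, AA, and the Kappa coefficient are commonly computed from a confusion matrix. It uses the standard definitions (with Cohen's kappa assumed for the Kappa coefficient) and is not the authors' evaluation code; the function name `classification_metrics` and the toy matrix are hypothetical, and the numbers are made up for illustration only.

```python
def classification_metrics(confusion):
    """Return (OA, AA, kappa) from a square confusion matrix.

    confusion[i][j] counts samples of true class i predicted as class j.
    Assumes every class has at least one true sample.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n_classes))

    # Overall accuracy: fraction of all samples classified correctly.
    oa = correct / total

    # Average accuracy: mean of per-class recall, so small classes
    # weigh as much as large ones.
    per_class = [confusion[i][i] / sum(confusion[i]) for i in range(n_classes)]
    aa = sum(per_class) / n_classes

    # Cohen's kappa: observed agreement corrected for the agreement
    # expected by chance (p_e), from row and column marginals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n_classes))
                  for j in range(n_classes)]
    p_e = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    kappa = (oa - p_e) / (1 - p_e)
    return oa, aa, kappa


# Toy 3-class confusion matrix (illustrative values, not from the paper).
toy = [
    [50, 2, 3],
    [4, 30, 1],
    [2, 3, 5],
]
oa, aa, kappa = classification_metrics(toy)
print(f"OA={oa:.4f}  AA={aa:.4f}  Kappa={kappa:.4f}")
```

In this toy case AA is noticeably lower than OA because the smallest class is classified poorly, which is why HSI studies with imbalanced classes usually report both alongside Kappa.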

Published in Scientific Reports

ISSN: 2045-2322 (Online)
Publisher: Nature Portfolio
Country of publisher: United Kingdom
LCC subjects: Medicine; Science
Website: https://www.nature.com/srep/
