A Dual Multi-Head Contextual Attention Network for Hyperspectral Image Classification

Miaomiao Liang; Qinghua He; Xiangchun Yu; Huai Wang; Zhe Meng; Licheng Jiao

doi:10.3390/rs14133091

Remote Sensing (Jun 2022)

Affiliations

Miaomiao Liang: School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, China
Qinghua He: School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, China
Xiangchun Yu: School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, China
Huai Wang: School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, China
Zhe Meng: School of Telecommunication and Information Engineering, Xi’an University of Posts and Telecommunications, Xi’an 710121, China
Licheng Jiao: Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, School of Artificial Intelligence, Xidian University, Xi’an 710071, China

DOI: https://doi.org/10.3390/rs14133091
Journal volume & issue: Vol. 14, no. 13, p. 3091

Abstract

To learn discriminative features from a hyperspectral image (HSI), which is a 3-D data cube, capturing multi-head self-attention from both the spatial and spectral domains is preferable, provided that the burden on model optimization and computation stays low. In this paper, we design a dual multi-head contextual self-attention (DMuCA) network for HSI classification with as few parameters and as low a computation cost as possible. To effectively capture rich contextual dependencies from both domains, we decouple the spatial and spectral contextual attention into two sub-blocks, SaMCA and SeMCA, in which depth-wise convolution is employed to contextualize the input keys in the pure dimension. Thereafter, multi-head local attention is implemented as group processing, in which the keys are alternately concatenated with the queries. In particular, in the SeMCA block, we group the spatial pixels by even sampling and create multi-head channel attention on each sampling set, which reduces the number of training parameters and avoids an increase in storage. In addition, the static contextual keys are fused with the dynamic attentional features in each block to strengthen the representation capacity of the model. Finally, the outputs of the decoupled sub-blocks are weighted and summed for 3-D attention perception of the HSI. The DMuCA module is then plugged into a ResNet to perform HSI classification. Extensive experiments demonstrate that the proposed DMuCA achieves excellent results compared with several state-of-the-art attention mechanisms built on the same backbone.
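The grouping idea in the SeMCA description can be illustrated with a small sketch. It is not the authors' implementation. It assumes that "even sampling" means interleaved (strided) pixel groups, and it uses plain dot-product channel attention in place of the paper's contextual attention, so there is no depth-wise convolution and no key-query concatenation. All function names and the fusion weights are hypothetical.

```python
import math

def even_sampling_groups(num_pixels, num_groups):
    """Interleaved groups: group g holds pixels g, g+G, g+2G, ... (assumed scheme)."""
    return [list(range(g, num_pixels, num_groups)) for g in range(num_groups)]

def softmax(row):
    """Numerically stable softmax over one list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def channel_attention(pixels):
    """Plain dot-product channel attention on one sampling set.

    pixels: list of spectral vectors, each of length C.
    A C x C affinity is built from the pixels in the set only, so its size
    does not depend on how many pixels the whole image has.
    """
    num_channels = len(pixels[0])
    scale = 1.0 / math.sqrt(len(pixels))
    affinity = [
        [sum(p[i] * p[j] for p in pixels) * scale for j in range(num_channels)]
        for i in range(num_channels)
    ]
    weights = [softmax(row) for row in affinity]
    # Each output channel is a convex combination of the input channels.
    return [
        [sum(weights[i][j] * p[j] for j in range(num_channels))
         for i in range(num_channels)]
        for p in pixels
    ]

def grouped_channel_attention(flat_pixels, num_groups):
    """Apply channel attention independently on each evenly sampled pixel set."""
    out = [None] * len(flat_pixels)
    for indices in even_sampling_groups(len(flat_pixels), num_groups):
        if not indices:  # more groups than pixels leaves some groups empty
            continue
        attended = channel_attention([flat_pixels[i] for i in indices])
        for i, vec in zip(indices, attended):
            out[i] = vec
    return out

def weighted_sum(branch_a, branch_b, w_a, w_b):
    """Fuse two branch outputs element-wise with scalar weights (hypothetical)."""
    return [
        [w_a * a + w_b * b for a, b in zip(vec_a, vec_b)]
        for vec_a, vec_b in zip(branch_a, branch_b)
    ]

# Four pixels with three spectral bands each, flattened from a 2 x 2 patch.
patch = [[1.0, 2.0, 3.0], [0.5, 0.1, 0.2], [2.0, 2.0, 2.0], [0.0, 1.0, 0.0]]
spectral = grouped_channel_attention(patch, num_groups=2)
# The raw patch stands in for a spatial-branch output in this toy example.
fused = weighted_sum(patch, spectral, 0.5, 0.5)
```

In this sketch each group builds its own small channel-affinity matrix from a subset of pixels, which is one plausible reading of how sampling sets keep parameters and storage from growing with the spatial size.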

Published in Remote Sensing

ISSN: 2072-4292 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science
Website: http://www.mdpi.com/journal/remotesensing/
