IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (Jan 2024)
A Cross-Attention-Based Multi-Information Fusion Transformer for Hyperspectral Image Classification
Abstract
In recent years, deep-learning-based classification methods have been widely used for hyperspectral images (HSIs). However, existing transformer-based HSI classification methods still leave room for improvement in how effectively and comprehensively they exploit the rich available information; for example, when multiple sources of image information are used, the interaction among them is insufficiently considered. To address these issues, cross-attention interaction, class-token and patch-token information, and multiscale spatial information are handled in a unified framework, and a cross-attention-based multi-information fusion transformer (CAMFT) is proposed for HSI classification. CAMFT comprises a multiscale patch embedding module, a residual-connection-based DeepViT (RCD) module, and a double-branch cross-attention (DBCA) module. First, the multiscale patch embedding module preprocesses the multi-information input by constructing processing branches at different scales and adding learnable class tokens. Second, the RCD module, which includes re-attention and residual connections, exploits rich information from different layers. Third, the DBCA module is constructed to obtain more representative multi-information fusion features; it not only integrates multiscale patch information but also effectively exploits the complementary information between class tokens and patch tokens through the interaction of the two branches. Extensive experiments demonstrate that, compared with other state-of-the-art classification methods, the proposed CAMFT method achieves the best classification performance and maintains excellent accuracy even with small training sample sizes.
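To make the double-branch cross-attention idea concrete, the following is a minimal sketch (not the authors' code) of the mechanism described in the abstract: the class token of one branch attends over the patch tokens of the other branch, so the two multiscale branches exchange complementary class-token and patch-token information. The module name, dimensions, and head count are illustrative assumptions.

```python
import torch
import torch.nn as nn


class CrossAttention(nn.Module):
    """Sketch: the class token of branch A queries the patch tokens of branch B."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm_q = nn.LayerNorm(dim)
        self.norm_kv = nn.LayerNorm(dim)

    def forward(self, cls_a: torch.Tensor, tokens_b: torch.Tensor) -> torch.Tensor:
        # cls_a:    (B, 1, dim)  learnable class token from branch A
        # tokens_b: (B, N, dim)  patch tokens from branch B (another scale)
        q = self.norm_q(cls_a)
        kv = self.norm_kv(tokens_b)
        out, _ = self.attn(q, kv, kv)   # cross-attention between the two branches
        return cls_a + out              # residual connection on the class token


if __name__ == "__main__":
    B, N, dim = 2, 49, 64
    cls_small = torch.randn(B, 1, dim)      # class token from the small-scale branch
    patches_large = torch.randn(B, N, dim)  # patch tokens from the large-scale branch
    fused = CrossAttention(dim)(cls_small, patches_large)
    print(fused.shape)                      # torch.Size([2, 1, 64])
```

In the full CAMFT design described above, this exchange would be applied symmetrically in both branches and combined with the multiscale patch embedding and RCD modules; the sketch only illustrates the token-level interaction.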
Keywords