IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (Jan 2024)
TCCU-Net: Transformer and CNN Collaborative Unmixing Network for Hyperspectral Image
Abstract
In recent years, deep-learning-based hyperspectral unmixing techniques have attracted increasing attention and advanced significantly. However, neither a convolutional neural network (CNN) nor a transformer alone captures both global and fine-grained information effectively, which compromises unmixing accuracy. To fully exploit the information contained in hyperspectral images, this article explores a dual-stream collaborative network, referred to as TCCU-Net. The network is trained end to end and learns information in four dimensions (spectral, spatial, global, and local) to achieve more effective unmixing. It comprises two core encoders: a transformer encoder, which includes squeeze-launch modules, DSSCR–vision transformer modules, and stripe pooling modules, and a CNN encoder, which is composed of two-dimensional (2-D) and 3-D pyramid convolutions. Fusing the outputs of the two encoders bridges the semantic gap between the encoder and decoder, resulting in improved feature mapping and unmixing outcomes. TCCU-Net is evaluated against seven hyperspectral unmixing methods on four datasets (Samson, Apex, Jasper Ridge, and a synthetic dataset). The experimental results show that the proposed approach surpasses the compared methods in accuracy and has the potential to address hyperspectral unmixing tasks effectively.
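As a rough illustration of the dual-stream idea described above, the following toy sketch fuses two feature vectors standing in for the transformer and CNN encoder outputs and turns the result into per-pixel abundances. It is not the TCCU-Net implementation: the element-wise-sum fusion, the softmax used to enforce nonnegative abundances that sum to one, the linear mixing reconstruction, and all names and numbers (`fuse`, `softmax`, `reconstruct`, the toy endmembers) are assumptions made only for this example.

```python
import math


def softmax(scores):
    """Map raw scores to nonnegative abundances that sum to one."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(transformer_features, cnn_features):
    """Hypothetical fusion step: element-wise sum of the two encoder outputs."""
    return [t + c for t, c in zip(transformer_features, cnn_features)]


def reconstruct(abundances, endmembers):
    """Linear mixing (assumed): pixel[b] = sum_k abundances[k] * endmembers[k][b]."""
    bands = len(endmembers[0])
    return [
        sum(a * spectrum[b] for a, spectrum in zip(abundances, endmembers))
        for b in range(bands)
    ]


# Toy stand-ins for the two encoder outputs (one score per endmember).
transformer_features = [0.2, 1.0, -0.5]
cnn_features = [0.1, 0.5, 0.0]

# Three made-up endmember spectra over four bands.
endmembers = [
    [0.1, 0.2, 0.3, 0.4],
    [0.5, 0.5, 0.5, 0.5],
    [0.9, 0.7, 0.4, 0.2],
]

abundances = softmax(fuse(transformer_features, cnn_features))
pixel = reconstruct(abundances, endmembers)
```

In this sketch the second endmember receives the largest abundance because both toy streams score it highest; a real network would learn these scores from the spectral and spatial context of each pixel.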
Keywords