Remote Sensing (May 2023)
Joint Classification of Hyperspectral and LiDAR Data Using Binary-Tree Transformer Network
Abstract
The joint use of multi-source data is of great significance in geospatial observation tasks such as urban planning, disaster assessment, and military applications. However, this approach faces several challenges: inconsistent data structures, unrelated physical properties, scarce training data, insufficient use of information, and imperfect feature fusion methods. This paper therefore proposes a novel binary-tree Transformer network (BTRF-Net), which fuses heterogeneous information and exploits the complementarity among multi-source remote sensing data to improve the joint classification of hyperspectral image (HSI) and light detection and ranging (LiDAR) data. First, a hyperspectral network (HSI-Net) extracts the spectral and spatial features of the hyperspectral images, while a LiDAR network (LiDAR-Net) extracts the elevation information of the LiDAR data. Second, a multi-source transformer complementor (MSTC) is designed that uses the complementarity and cooperation among multi-modal features in remote sensing images to better capture their correlation. The multi-head complementarity attention mechanism (MHCA) within this complementor captures both the global features and the local texture information of the images, thereby achieving full feature fusion. Third, to obtain the feature information of multi-source remote sensing images as completely as possible, this paper designs a complete binary tree structure, the binary feature search tree (BFST), which fuses multi-modal features at different network levels to obtain multiple image features with stronger representation ability, enhancing the stability and robustness of the network. Finally, several groups of experiments are designed to compare the proposed BTRF-Net with traditional methods and several advanced deep learning networks on two datasets: Houston and Trento.
The results show that the proposed network outperforms other state-of-the-art methods, even when only a small number of training samples is available.
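The abstract does not give the internals of MHCA or BFST, so the following is only an illustrative toy sketch of the two ideas it names: attention in which one modality queries the other, and pairwise fusion of features up a complete binary tree. All function names (`cross_attention`, `fuse_pair`, `tree_fuse`), the single-head formulation without learned projections, the residual averaging, and the sample values are assumptions made for illustration; they are not the authors' implementation.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Scaled dot-product attention: every query token attends over all key tokens."""
    scale = math.sqrt(len(keys[0]))
    attended = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        attended.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return attended

def fuse_pair(a, b):
    """Assumed fusion rule: each modality queries the other, residuals are added,
    and the two enriched token sets are averaged. Requires equal token counts
    and equal feature dimensions."""
    a_from_b = cross_attention(a, b, b)  # tokens of a gather context from b
    b_from_a = cross_attention(b, a, a)  # tokens of b gather context from a
    return [
        [(x + xa + y + yb) / 2.0 for x, xa, y, yb in zip(ta, tab, tb, tba)]
        for ta, tab, tb, tba in zip(a, a_from_b, b, b_from_a)
    ]

def tree_fuse(leaves):
    """Fuse leaf feature sets pairwise, level by level, up to a single root."""
    count = len(leaves)
    if count == 0 or count & (count - 1):
        raise ValueError("number of leaves must be a power of two")
    level = list(leaves)
    while len(level) > 1:
        level = [fuse_pair(level[i], level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

# Hypothetical token features (2 tokens x 2 channels) from two network depths.
hsi_shallow = [[1.0, 0.0], [0.0, 1.0]]
lidar_shallow = [[0.5, 0.5], [1.0, 0.0]]
hsi_deep = [[0.2, 0.8], [0.6, 0.4]]
lidar_deep = [[0.9, 0.1], [0.3, 0.7]]
root = tree_fuse([hsi_shallow, lidar_shallow, hsi_deep, lidar_deep])
```

Here `root` has the same shape as each leaf (two tokens of two channels). A real network would use learned query, key, and value projections, several heads, and trained fusion layers at each tree node; the sketch only shows how the data would flow under these assumptions.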
Keywords