IEEE Access (Jan 2023)

Hyperspectral Image Classification: An Analysis Employing CNN, LSTM, Transformer, and Attention Mechanism

  • Felipe Viel,
  • Renato Cotrim Maciel,
  • Laio Oriel Seman,
  • Cesar Albenes Zeferino,
  • Eduardo Augusto Bezerra,
  • Valderi Reis Quietinho Leithardt

DOI
https://doi.org/10.1109/ACCESS.2023.3255164
Journal volume & issue
Vol. 11
pp. 24835–24850

Abstract

Hyperspectral images contain tens to hundreds of bands, implying a high spectral resolution. This high spectral resolution allows obtaining a precise signature of the structures and compounds that make up the captured scene. Among the types of processing that may be applied to hyperspectral images, classification using machine learning models stands out. The classification process is one of the most relevant steps for this type of image; it can extract information using spatial information, spectral information, or spatial-spectral fusion. Artificial Neural Network models have been gaining prominence among existing classification techniques and can be applied to data with one, two, or three dimensions. Given the above, this work evaluates Convolutional Neural Network models with one, two, and three dimensions to identify the impact of different types of convolution on the classification of hyperspectral images. We also expand the comparison to Recurrent Neural Network models, the Attention Mechanism, and the Transformer architecture. Furthermore, a novel pre-processing method is proposed for the classification process to avoid data leakage between training, validation, and testing data. The results demonstrate that the one-dimensional Convolutional Neural Network (1D-CNN), Long Short-Term Memory (LSTM), and Transformer architectures reduce memory consumption and per-sample processing time while maintaining satisfactory classification performance, with up to 99% accuracy on larger datasets. In addition, the Transformer architecture can approach the 2D-CNN and 3D-CNN architectures in accuracy using only spectral information. The results also show that two- or three-dimensional convolution layers improve accuracy at the cost of greater memory consumption and processing time per sample. Finally, the pre-processing methodology guarantees that training and testing data remain disjoint.
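The abstract's point about avoiding data leakage can be illustrated with a minimal sketch (not the paper's actual pre-processing method): labeled pixel coordinates are partitioned into train/validation/test index sets once, up front, so no pixel can appear in more than one subset. The cube shape, split ratios, and variable names below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a labeled hyperspectral cube: 10x10 pixels, 5 bands
# (real scenes such as Indian Pines have hundreds of bands).
cube = rng.normal(size=(10, 10, 5))
labels = rng.integers(0, 3, size=(10, 10))

# Collect all labeled pixel positions, shuffle them, and split the
# *indices* (60/20/20) before any feature extraction, so the same
# pixel can never leak into more than one subset.
coords = np.argwhere(labels >= 0)
rng.shuffle(coords)
n = len(coords)
train, val, test = np.split(coords, [int(0.6 * n), int(0.8 * n)])

# Sanity check: the three index sets are pairwise disjoint.
as_set = lambda a: {tuple(c) for c in a}
assert not (as_set(train) & as_set(val))
assert not (as_set(train) & as_set(test))
assert not (as_set(val) & as_set(test))
```

Splitting indices before windowing or normalization is what prevents the train/test overlap the authors' methodology is designed to rule out.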

Keywords