Remote Sensing (Jun 2023)
Improving Feature Learning in Remote Sensing Images Using an Integrated Deep Multi-Scale 3D/2D Convolutional Network
Abstract
Developing complex hyperspectral image (HSI) sensors that capture high-resolution spatial information and voluminous (hundreds) spectral bands of the earth’s surface has made HSI pixel-wise classification a reality. The 3D-CNN has become the preferred HSI pixel-wise classification approach because of its ability to extract discriminative spectral and spatial information while maintaining data integrity. However, HSI datasets are characterized by high nonlinearity, voluminous spectral features, and limited training sample data. Therefore, developing deep HSI classification methods that purely utilize 3D-CNNs in their network structure often results in computationally expensive models prone to overfitting when the model depth increases. In this regard, this paper proposes an integrated deep multi-scale 3D/2D convolutional network block (MiCB) for simultaneous low-level spectral and high-level spatial feature extraction, which can optimally train on limited sample data. The strength of the proposed MiCB model solely lies in the innovative arrangement of convolution layers, giving the network the ability (i) to simultaneously convolve the low-level spectral with high-level spatial features; (ii) to use multiscale kernels to extract abundant contextual information; (iii) to apply residual connections to solve the degradation problem when the model depth increases beyond the threshold; and (iv) to utilize depthwise separable convolutions in its network structure to address the computational cost of the proposed MiCB model. We evaluate the efficacy of our proposed MiCB model using three publicly accessible HSI benchmarking datasets: Salinas Scene (SA), Indian Pines (IP), and the University of Pavia (UP). When trained on small amounts of training sample data, MiCB is better at classifying than the state-of-the-art methods used for comparison. For instance, the MiCB achieves a high overall classification accuracy of 97.35%, 98.29%, and 99.20% when trained on 5% IP, 1% UP, and 1% SA data, respectively.
Keywords