International Journal of Digital Earth (Dec 2024)
Scene classification for remote sensing image of land use and land cover using dual-model architecture with multilevel feature fusion
Abstract
Scene classification for remote sensing images (RSIs) of land use and land cover (LULC) involves identifying the discriminative features of interest in different classes. Spurred by the powerful feature extraction capability of Convolutional Neural Networks (CNNs), LULC classification for RSIs has developed rapidly in recent years. Although multi-model approaches classify better than single models, how the models are combined remains the key to maximizing classification accuracy. Thus, this paper proposes XE-Net, a dual-model architecture with multilevel feature fusion. Specifically, high-, middle-, and low-level features are extracted by Xception and EfficientNet-V2, and through transfer learning both models reuse weights pretrained on ImageNet. The designed sibling feature fusion algorithm then fuses the three levels of features extracted by the two models sequentially, level by level. Next, the proposed multi-scale feature fusion method systematically enhances the fused three-scale features to strengthen the discriminative feature. Finally, the discriminative feature is input into the classifier to obtain the classification results. In extensive experiments, XE-Net attains a maximum average overall accuracy of 96.84% on the RSSCN-7 dataset, and 99.58%, 99.37%, 97.07%, 95.03%, and 95.78% on WHU-19, UCM-21, OPTIMAL-31, NWPU-RESISC45, and AID, respectively, demonstrating our model's superiority.
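The data flow described in the abstract (three feature levels from each of two backbones, same-level fusion, multi-scale fusion, then a classifier) can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration and is not taken from the paper: random nested lists stand in for the Xception and EfficientNet-V2 feature maps, same-level fusion is approximated by channel concatenation, multi-scale fusion by global average pooling followed by concatenation, and the classifier by a single softmax layer. The function names are hypothetical.

```python
import math
import random


def sibling_fusion(feats_a, feats_b):
    """Fuse same-level feature maps from two backbones.

    Assumption: fusion is plain channel concatenation. Each feature map
    is a list of channels, and each channel is a 2D list of floats.
    """
    return [a + b for a, b in zip(feats_a, feats_b)]


def global_avg_pool(feature_map):
    """Reduce each channel of a feature map to its mean value."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]


def multiscale_fusion(fused_levels):
    """Assumption: pool every level and join the results into one descriptor."""
    descriptor = []
    for level in fused_levels:
        descriptor.extend(global_avg_pool(level))
    return descriptor


def softmax_classifier(descriptor, weights, biases):
    """Single linear layer followed by a numerically stable softmax."""
    logits = [
        sum(w * x for w, x in zip(row, descriptor)) + b
        for row, b in zip(weights, biases)
    ]
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def random_map(rng, channels, size):
    """Hypothetical stand-in for a backbone feature map (channels x size x size)."""
    return [
        [[rng.gauss(0.0, 1.0) for _ in range(size)] for _ in range(size)]
        for _ in range(channels)
    ]


rng = random.Random(0)
# Low-, middle-, and high-level maps from two stand-in backbones.
feats_a = [random_map(rng, 4, 8), random_map(rng, 8, 4), random_map(rng, 16, 2)]
feats_b = [random_map(rng, 3, 8), random_map(rng, 6, 4), random_map(rng, 12, 2)]

fused = sibling_fusion(feats_a, feats_b)
descriptor = multiscale_fusion(fused)

num_classes = 7  # e.g. the seven scene classes of RSSCN-7
weights = [[rng.gauss(0.0, 0.1) for _ in descriptor] for _ in range(num_classes)]
biases = [0.0] * num_classes
probs = softmax_classifier(descriptor, weights, biases)
```

Concatenation requires the two backbones to produce the same spatial size at each level, which the stand-in maps satisfy by construction; a real implementation would need resizing or projection layers wherever the backbones disagree.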
Keywords