IET Image Processing (Jun 2021)
Efficient recurrent attention network for remote sensing scene classification
Abstract
Scene classification for remote sensing is a popular topic, and many recent methods based on convolutional neural networks (CNNs) have shown great model capacity and the ability to learn highly discriminative features. Given a large amount of training data, a CNN can extract rich features and learn to predict the scene class of a remote sensing image. However, for supervised learning tasks, deep models often rely on a large number of labelled remote sensing images, which are difficult to pre-process. Thus, training a lightweight deep learning model is essential. An imbalance between easily classified and hard samples in the training set can also let the easy samples overwhelm the loss function. Accordingly, a novel Efficient Recurrent Attention Network (ERANet) for remote sensing scene classification is proposed. Unlike traditional deep learning methods, ERANet introduces EfficientNet-B0 as a lightweight backbone for the ARCNet framework, replacing the original backbone. With this modified efficient backbone, ERANet keeps both its floating-point operations (FLOPs) and its parameter count low. The significance of focal loss is examined, and focal loss is applied to address the sample imbalance problem and yield desirable performance. Extensive experiments on several challenging remote sensing scene classification data sets demonstrate the efficiency of the proposed ERANet.
Keywords