IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (Jan 2023)

G2Grad-CAMRL: An Object Detection and Interpretation Model Based on Gradient-Weighted Class Activation Mapping and Reinforcement Learning in Remote Sensing Images

  • Shoulin Yin,
  • Liguo Wang,
  • Muhammad Shafiq,
  • Lin Teng,
  • Asif Ali Laghari,
  • Muhammad Faizan Khan

DOI
https://doi.org/10.1109/JSTARS.2023.3241405
Journal volume & issue
Vol. 16
pp. 3583 – 3598

Abstract

Remote sensing images (RSIs) contain important information about targets such as airports, ports, and ships. By extracting RSI features and learning the mapping between image features and text semantic features, RSI content can be interpreted and described, which is valuable in military and civil applications such as national defense security, land monitoring, urban planning, and disaster mitigation. To address the complex backgrounds of RSIs, the lack of interpretability in existing target detection models, and the problems in feature extraction across different network structures and layers and in target classification accuracy, we propose an object detection and interpretation model based on gradient-weighted class activation mapping and reinforcement learning. First, ResNet is used as the backbone network to extract RSI features and generate feature maps. Then, a global average pooling layer is added to obtain the weight vector corresponding to the feature maps, and the weighted feature maps are superimposed to produce class activation maps. Reinforcement learning is used to optimize the region generation network, and the reward function is improved to make that network more effective. Finally, network dissection analysis is used to identify the interpretable semantic concepts in the model. In experiments, the average accuracy exceeds 85%. Results on a public RSI description dataset show that the proposed method achieves high detection accuracy and good description performance for RSIs in complex environments.
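
The pooling and superposition steps described in the abstract can be illustrated with a minimal sketch of the generic Grad-CAM computation. This is not the authors' implementation: the abstract gives no code, and the function name `grad_cam`, the plain nested-list representation, and the final normalization step are all assumptions made for illustration. The sketch takes K feature maps and the gradients of a class score with respect to them, averages each gradient map to get one weight per channel, and sums the weighted feature maps through a ReLU.

```python
from typing import List

Map2D = List[List[float]]


def grad_cam(feature_maps: List[Map2D], gradients: List[Map2D]) -> Map2D:
    """Hypothetical helper: Grad-CAM-style heatmap from K feature maps
    and the gradients of one class score with respect to those maps."""
    height, width = len(feature_maps[0]), len(feature_maps[0][0])

    # Global average pooling over each gradient map gives one weight per channel.
    weights = [sum(map(sum, g)) / (height * width) for g in gradients]

    # Weighted sum of the feature maps at every pixel, followed by ReLU.
    cam = [
        [
            max(0.0, sum(w * a[i][j] for w, a in zip(weights, feature_maps)))
            for j in range(width)
        ]
        for i in range(height)
    ]

    # Assumed display step: scale to [0, 1]; an all-zero map is returned as is.
    peak = max(max(row) for row in cam)
    return [[v / peak for v in row] for row in cam] if peak > 0 else cam
```

In practice the feature maps and gradients would come from a convolutional layer of the backbone (ResNet in the paper), and the resulting map would be upsampled to the input image size before being overlaid on the image.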

Keywords