IEEE Access (Jan 2024)

YOLOX-CA: A Remote Sensing Object Detection Model Based on Contextual Feature Enhancement and Attention Mechanism

  • Chao Wu,
  • Zhiyong Zeng

DOI
https://doi.org/10.1109/ACCESS.2024.3414426
Journal volume & issue
Vol. 12
pp. 84632 – 84642

Abstract

Compared with natural images, remote sensing images are characterized by high spatial resolution, large variation in target scale, dense target distribution, and complex backgrounds. As a result, detectors often suffer from insufficient accuracy and imprecise target localization. To address these challenges, this paper introduces YOLOX-CA, a remote sensing object detection algorithm based on the YOLOX model. First, YOLOX-CA optimizes the feature extraction network of YOLOX by employing large-kernel depthwise separable convolution in the backbone, which strengthens feature extraction and captures feature information more comprehensively and accurately. Second, the ACmix attention mechanism is introduced into the backbone to identify crucial features, further enhance feature extraction, and speed up network convergence. Finally, a Contextual Feature Enhancement (CFE) module is constructed and applied in the upsampling stage of feature fusion to strengthen the model’s awareness of context. Experiments on the large-scale DIOR remote sensing object detection dataset show improvements over the baseline model of 2.7% in mAP, 1.1% in mAP@0.5, and 2.2% in Recall. The results on the test set suggest that YOLOX-CA is applicable and practical for remote sensing object detection, improving detection accuracy while reducing missed targets.
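
As a rough illustration of why depthwise separable convolution makes large kernels affordable, the sketch below compares weight counts for a standard convolution and for a depthwise convolution followed by a 1 x 1 pointwise convolution. The channel counts and kernel sizes are assumptions chosen only for illustration; the abstract does not state the values used in YOLOX-CA, and the function names are hypothetical.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depthwise conv plus a 1 x 1 pointwise conv (bias omitted)."""
    depthwise = c_in * k * k   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, not taken from the paper.
c_in, c_out = 256, 256
for k in (3, 7):
    std = standard_conv_params(c_in, c_out, k)
    dws = depthwise_separable_params(c_in, c_out, k)
    print(f"k={k}: standard={std:,} separable={dws:,} ratio={std / dws:.1f}x")
```

Under these assumed sizes, enlarging the kernel from 3 to 7 multiplies the standard convolution's weights by about 5.4, while the separable version grows only modestly because the kernel area affects the depthwise term alone.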

Keywords