CA-YOLO: Model Optimization for Remote Sensing Image Object Detection

Lingyun Shen; Baihe Lang; Zhengxun Song

doi:10.1109/ACCESS.2023.3290480

IEEE Access (Jan 2023)

Affiliations

Lingyun Shen: Department of Electronic Engineering, Taiyuan Institute of Technology, Taiyuan, China
Baihe Lang: School of Electronics and Information Engineering, Changchun University of Science and Technology, Changchun, China
Zhengxun Song: Overseas Expertise Introduction Project for Discipline Innovation No. D17017, Changchun University of Science and Technology, Changchun, China

DOI: https://doi.org/10.1109/ACCESS.2023.3290480
Journal volume & issue: Vol. 11
pp. 64769 – 64781

Abstract

The CA-YOLO (Coordinate Attention-YOLO) model has been optimized for object detection in complex remote sensing images, addressing key issues faced by algorithms that detect multiple objects. These issues include weak multi-scale feature learning capabilities and the challenging trade-off between detection accuracy and model parameter complexity. The CA-YOLO model, built on the framework of YOLOv5, incorporates a lightweight coordinate attention module in the shallow layer to improve detailed feature extraction and suppress redundant information interference. Additionally, a spatial pyramid pooling-fast with a tandem construction module is implemented in the deeper layer. The model employs a stochastic pooling strategy to fuse multi-scale key feature information from low-level to high-level layers, reducing the number of model parameters while improving inference speed. We optimized the anchor box mechanism and modified loss function to improve the ability of the model to detect objects of different sizes and scales. Results show that the CA-YOLO model outperforms the original YOLO in terms of multi-object detection accuracy, with an average mAP@0.5 accuracy improvement of 4.8% and mAP@0.5:0.95 accuracy improvement of 3.8%. Additionally, the CA-YOLO model demonstrates exceptional inference speed, averaging 125 fps, which reinforces its superiority in detection accuracy, generalization ability, and overall efficiency. Notably, these improvements were achieved while maintaining the same number of parameters and complexity as other models, making the CA-YOLO model an exceptional choice for various applications.
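
The two accuracy figures in the abstract depend on the intersection-over-union (IoU) threshold at which a predicted box counts as a match. As a rough illustration only, and not the authors' evaluation code, the sketch below computes IoU for axis-aligned boxes and lists the thresholds conventionally associated with mAP@0.5 (a single threshold of 0.5) and mAP@0.5:0.95 (ten thresholds from 0.5 to 0.95 in steps of 0.05). The function names and the (x1, y1, x2, y2) box layout are assumptions made for this example.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes (assumed layout)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp at zero so non-overlapping boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# IoU thresholds conventionally behind the two metrics (hypothetical names).
MAP_50_THRESHOLDS = [0.5]
MAP_50_95_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def matched_thresholds(pred, truth, thresholds):
    """Return the thresholds at which `pred` would count as a hit on `truth`."""
    score = iou(pred, truth)
    return [t for t in thresholds if score >= t]
```

A prediction that overlaps its ground-truth box only moderately passes the 0.5 threshold but fails the stricter ones, which is why mAP@0.5:0.95 is typically the lower and more demanding of the two numbers.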

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639