TO–YOLOX: a pure CNN tiny object detection model for remote sensing images

Zhe Chen; Yuan Liang; Zhengbo Yu; Ke Xu; Qingyun Ji; Xueqi Zhang; Quanping Zhang; Zijia Cui; Ziqiong He; Ruichun Chang; Zhongchang Sun; Keyan Xiao; Huadong Guo

doi:10.1080/17538947.2023.2261901

International Journal of Digital Earth (Dec 2023)

Affiliations

Zhe Chen: Chengdu University of Technology
Yuan Liang: Chengdu University of Technology
Zhengbo Yu: Chengdu University of Technology
Ke Xu: Chengdu University of Technology
Qingyun Ji: Chengdu University of Technology
Xueqi Zhang: Chengdu University of Technology
Quanping Zhang: China University of Geosciences (Beijing)
Zijia Cui: China University of Geosciences (Beijing)
Ziqiong He: Digital Hu Line Research Institute, Chengdu University of Technology
Ruichun Chang: Chengdu University of Technology
Zhongchang Sun: Aerospace Information Research Institute, Chinese Academy of Sciences
Keyan Xiao: Institute of Mineral Resources, Chinese Academy of Geological Sciences
Huadong Guo: Aerospace Information Research Institute, Chinese Academy of Sciences

DOI: https://doi.org/10.1080/17538947.2023.2261901
Journal volume & issue: Vol. 16, no. 1, pp. 3882–3904

Abstract

Remote sensing and deep learning are widely combined in tasks such as urban planning and disaster prevention. However, because of interference caused by density, overlap, and coverage, tiny object detection in remote sensing images has remained a difficult problem. We therefore propose a novel model, TO–YOLOX (Tiny Object–You Only Look Once). TO–YOLOX has a MiSo (Multiple-in-Single-out) feature fusion structure that incorporates a spatial-shift structure; the model balances positive and negative samples and enhances information interaction within local patches of remote sensing images. TO–YOLOX uses an adaptive IOU-T (Intersection Over Uni-Tiny) loss to improve the localization accuracy of tiny objects, and it applies the Group-CBAM (group-convolutional block attention module) attention mechanism to enhance the perception of tiny objects in remote sensing images. To verify the effectiveness and efficiency of TO–YOLOX, we used three aerial-photography tiny object detection datasets, namely VisDrone2021, Tiny Person, and DOTA–HBB, on which the model recorded mean average precision (mAP) values of 45.31% (+10.03%), 28.9% (+9.36%), and 63.02% (+9.62%), respectively. TO–YOLOX recognizes tiny objects better than Faster R-CNN, RetinaNet, YOLOv5, YOLOv6, YOLOv7, and YOLOX, and it is computationally fast.
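As background for the localization loss mentioned in the abstract, the sketch below computes the standard intersection over union (IoU) for axis-aligned boxes and shows how sensitive it is to small offsets when the object is tiny. This is plain IoU, not the paper's IOU-T loss, whose formulation is not given here; the function names `iou` and `shifted_iou`, the box sizes, and the 2-pixel shift are illustrative assumptions.

```python
def iou(box_a, box_b):
    """Standard IoU for axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlap rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def shifted_iou(size, shift):
    """IoU between a size x size box and a copy shifted diagonally by `shift` pixels."""
    box = (0.0, 0.0, float(size), float(size))
    moved = (float(shift), float(shift), float(size + shift), float(size + shift))
    return iou(box, moved)


# Hypothetical sizes: the same 2-pixel localization error on a tiny and a large box.
tiny = shifted_iou(8, 2)    # 8 x 8 box
large = shifted_iou(64, 2)  # 64 x 64 box
print(f"tiny: {tiny:.3f}, large: {large:.3f}")
```

The same 2-pixel error costs the 8 x 8 box far more IoU than the 64 x 64 box, which is one common motivation for localization losses tailored to tiny objects.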

Published in International Journal of Digital Earth

ISSN: 1753-8947 (Print); 1753-8955 (Online)
Publisher: Taylor & Francis Group
Country of publisher: United Kingdom
LCC subjects: Geography. Anthropology. Recreation: Mathematical geography. Cartography
Website: https://www.tandfonline.com/journals/tjde
