DETR-crowd is all you need

Liu Weijia; Zishen Zheng; Ke Fan; Kun He; Taiqiu Huang; Weijia Liu; Xianlun Ke; Yuming Xu

doi:10.47813/2782-2818-2023-3-2-0213-0224

Modern Innovations, Systems and Technologies (Современные инновации, системы и технологии), May 2023

Affiliations

Liu Weijia: Trine University, Phoenix, USA
Zishen Zheng: Taiyuan University of Technology, Taiyuan, China
Ke Fan: Arizona State University, Phoenix, USA
Kun He: Illinois Institute of Technology, Chicago, USA
Taiqiu Huang: Shenzhen University, Shenzhen, China
Weijia Liu: Trine University, Phoenix, USA
Xianlun Ke: Yunnan University, Kunming, China
Yuming Xu: Shenzhen University, Shenzhen, China

DOI: https://doi.org/10.47813/2782-2818-2023-3-2-0213-0224
Journal volume & issue: Vol. 3, no. 2

Abstract

Crowded pedestrian detection is an active research topic within pedestrian detection. To reduce missed targets and improve the detection of small pedestrians in crowded scenes, this paper proposes DETR-crowd, an improved object detection algorithm based on DETR. The attention-based DETR model serves as the baseline, so that objects can still be detected when some of their features are missing in crowded pedestrian scenes. A deformable attention encoder is introduced to make effective use of multi-scale feature maps, which carry a large amount of small-target information, and thereby improve detection accuracy for small pedestrians. To extract and refine important features more efficiently, feature extraction is performed by an improved EfficientNet backbone fused with a channel-spatial attention module. To address the low training efficiency of models that use attention-based detection modules, Smooth-L1 and GIoU are combined as the loss function during training, allowing the model to converge to higher precision. Experimental results on the Wider-Person crowded pedestrian detection dataset show that the proposed algorithm outperforms YOLO-X by 0.039 and YOLO-V5 by 0.015 in AP50. The proposed algorithm can be applied effectively to crowded pedestrian detection tasks.
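The combined box-regression loss mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the weighting, the Smooth-L1 threshold, or the box parameterization, so the names `w_l1`, `w_giou`, and `beta`, their default values, and the corner format (x1, y1, x2, y2) are assumptions made for illustration only, not the authors' settings.

```python
from typing import Sequence

Box = Sequence[float]  # assumed corner format: (x1, y1, x2, y2), x1 < x2, y1 < y2


def smooth_l1(pred: Box, target: Box, beta: float = 1.0) -> float:
    """Mean Smooth-L1 loss over the four box coordinates."""
    total = 0.0
    for p, t in zip(pred, target):
        diff = abs(p - t)
        # Quadratic near zero, linear for large errors.
        total += 0.5 * diff * diff / beta if diff < beta else diff - 0.5 * beta
    return total / len(pred)


def giou(pred: Box, target: Box) -> float:
    """Generalized IoU of two boxes; the value lies in (-1, 1]."""
    area_p = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_t = (target[2] - target[0]) * (target[3] - target[1])

    # Intersection rectangle (zero area if the boxes do not overlap).
    inter_w = max(0.0, min(pred[2], target[2]) - max(pred[0], target[0]))
    inter_h = max(0.0, min(pred[3], target[3]) - max(pred[1], target[1]))
    inter = inter_w * inter_h
    union = area_p + area_t - inter

    # Smallest axis-aligned box enclosing both inputs.
    enc_w = max(pred[2], target[2]) - min(pred[0], target[0])
    enc_h = max(pred[3], target[3]) - min(pred[1], target[1])
    enclose = enc_w * enc_h

    return inter / union - (enclose - union) / enclose


def box_loss(pred: Box, target: Box, w_l1: float = 1.0, w_giou: float = 1.0) -> float:
    """Weighted sum of Smooth-L1 and (1 - GIoU); the weights are hypothetical."""
    return w_l1 * smooth_l1(pred, target) + w_giou * (1.0 - giou(pred, target))


if __name__ == "__main__":
    print(box_loss((0.0, 0.0, 2.0, 2.0), (0.0, 0.0, 2.0, 2.0)))  # identical boxes
    print(box_loss((0.0, 0.0, 1.0, 1.0), (2.0, 0.0, 3.0, 1.0)))  # disjoint boxes
```

Unlike plain IoU, the GIoU term still varies with distance when the predicted and ground-truth boxes do not overlap, so it keeps providing a training signal in that case, while the Smooth-L1 term penalizes coordinate errors directly.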

Published in Modern Innovations, Systems and Technologies (Современные инновации, системы и технологии)

ISSN: 2782-2826 (Print); 2782-2818 (Online)
Publisher: Siberian Scientific Centre DNIT
Country of publisher: Russian Federation
LCC subjects: Technology: Technology (General)
Website: https://oajmist.com/
