Multi-Scale Structure Perception and Global Context-Aware Method for Small-Scale Pedestrian Detection

Hao Gao; Shucheng Huang; Mingxing Li; Tian Li

doi:10.1109/ACCESS.2024.3406968

IEEE Access (Jan 2024)

Affiliations

Hao Gao: School of Computer, Jiangsu University of Science and Technology, Zhenjiang, China
Shucheng Huang: School of Computer, Jiangsu University of Science and Technology, Zhenjiang, China
Mingxing Li: Jingjiang College, Jiangsu University, Zhenjiang, China
Tian Li: Suzhou Institute of Technology, Jiangsu University of Science and Technology, Suzhou, China

DOI: https://doi.org/10.1109/ACCESS.2024.3406968
Journal volume & issue: Vol. 12
pp. 76392 – 76403

Abstract

In pedestrian detection, small-scale pedestrians occupy few pixels and provide insufficient features, which often leads to false or missed detections. This paper therefore proposes a multi-scale structure perception and global context-aware method for small-scale pedestrian detection. First, to address the loss of features as the network deepens, we design a feature fusion strategy that overcomes the constraints of the feature pyramid hierarchy. This strategy combines deep and shallow feature maps and leverages the Transformer's ability to capture long-range dependencies, incorporating a global context information module to retain a substantial amount of small-scale pedestrian features. Second, because small-scale pedestrian features are easily confused with background information, we combine self-attention and channel attention modules to jointly model the spatial and channel correlations of feature maps. Using small-scale pedestrian context and channel information in this way enhances small-scale pedestrian features while suppressing background information. Finally, to address gradient explosion during model training, we introduce a novel weighted loss function named ES-IoU, which significantly improves convergence speed. Extensive experiments on the CityPersons and CrowdHuman datasets demonstrate that the proposed method achieves a substantial improvement over state-of-the-art methods.
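
For readers unfamiliar with IoU-based losses, the following is a minimal sketch of the standard intersection over union (IoU) between two bounding boxes and the plain 1 - IoU loss built on it. The abstract does not define the weighting used in ES-IoU, so this is not the authors' loss; the function names `iou` and `iou_loss` and the `(x1, y1, x2, y2)` box layout are assumptions made for illustration only.

```python
def iou(box_a, box_b):
    """Standard IoU of two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumed box layout; not taken from the paper.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


def iou_loss(pred, target):
    """Plain 1 - IoU loss; ES-IoU adds a weighting not described in the abstract."""
    return 1.0 - iou(pred, target)


if __name__ == "__main__":
    # The same 2-pixel horizontal offset applied to a small and a large box.
    small = iou((0, 0, 10, 20), (2, 0, 12, 20))
    large = iou((0, 0, 100, 200), (2, 0, 102, 200))
    print(f"small box IoU: {small:.3f}, large box IoU: {large:.3f}")
```

The same localization error lowers the IoU of a small box far more than that of a large one, which is one reason the choice of regression loss matters for small-scale pedestrians.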

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
