Mathematics (Feb 2025)
EHAFF-NET: Enhanced Hybrid Attention and Feature Fusion for Pedestrian ReID
Abstract
This study addresses the cross-scenario challenges in pedestrian re-identification for public safety, including perspective differences, lighting variations, occlusions, and ambiguous feature representations. We propose a pedestrian re-identification method, EHAFF-NET, which integrates an enhanced hybrid attention mechanism with multi-branch feature fusion. We introduce the Enhanced Hybrid Attention Module (EHAM), which combines channel and spatial attention mechanisms. The channel attention mechanism uses self-attention to capture long-range dependencies and extracts multi-scale local features with convolutional kernels and channel shuffling. The spatial attention mechanism aggregates features using global average and max pooling to enhance spatial representation. To tackle perspective differences, lighting changes, and occlusions, we incorporate the Multi-Branch Feature Integration module. The global branch captures overall information with global average pooling, while the local branch integrates features from different layers via the Diverse-Depth Feature Integration Module (DDFIM) to extract multi-scale semantic information. The local branch also extracts features based on human body proportions, balancing high-level semantics and low-level details. Experiments show that our model achieves an mAP of 92.5% and R1 of 94.7% on the Market-1501 dataset, an mAP of 85.4% and R1 of 88.6% on the DukeMTMC-reID dataset, and an mAP of 49.1% and R1 of 73.8% on the MSMT17 dataset, demonstrating a clear accuracy advantage over several advanced models.
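To make the attention design described above concrete, the PyTorch snippet below gives a minimal, hypothetical sketch of a hybrid channel/spatial attention block in the spirit of EHAM: the channel branch mixes a globally pooled channel descriptor with two 1D convolutions of different kernel sizes and applies channel shuffling, and the spatial branch aggregates global average and max pooled maps. It omits the self-attention component of the paper's channel branch, and all kernel sizes, the group count, and module names are illustrative assumptions rather than the authors' implementation.

```python
# Loose sketch of a hybrid channel/spatial attention block (not the paper's code).
# Kernel sizes, the group count, and the class name are illustrative assumptions.

import torch
import torch.nn as nn


def channel_shuffle(x, groups):
    """Interleave channels across groups (as in ShuffleNet)."""
    b, c, h, w = x.size()
    x = x.view(b, groups, c // groups, h, w)
    x = x.transpose(1, 2).contiguous()
    return x.view(b, c, h, w)


class HybridAttention(nn.Module):
    """Channel attention (multi-scale 1D convs on a pooled descriptor, then
    channel shuffling) followed by spatial attention (global average and max
    pooling over the channel dimension)."""

    def __init__(self, channels, groups=4):
        super().__init__()
        # Channel branch: squeeze spatially, then mix channels at two scales.
        self.gap = nn.AdaptiveAvgPool2d(1)
        self.conv3 = nn.Conv1d(1, 1, kernel_size=3, padding=1, bias=False)
        self.conv5 = nn.Conv1d(1, 1, kernel_size=5, padding=2, bias=False)
        self.groups = groups
        # Spatial branch: 2 -> 1 attention map from avg/max descriptors.
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        b, c, _, _ = x.size()
        # --- channel attention ---
        y = self.gap(x).view(b, 1, c)               # (B, 1, C) channel descriptor
        y = self.conv3(y) + self.conv5(y)           # multi-scale channel mixing
        w = self.sigmoid(y).view(b, c, 1, 1)
        x = channel_shuffle(x * w, self.groups)     # re-weight, then shuffle groups
        # --- spatial attention ---
        avg = torch.mean(x, dim=1, keepdim=True)    # (B, 1, H, W)
        mx, _ = torch.max(x, dim=1, keepdim=True)   # (B, 1, H, W)
        s = self.sigmoid(self.spatial(torch.cat([avg, mx], dim=1)))
        return x * s


if __name__ == "__main__":
    feat = torch.randn(2, 256, 24, 12)              # dummy backbone feature map
    out = HybridAttention(256)(feat)
    print(out.shape)                                # torch.Size([2, 256, 24, 12])
```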
Keywords