Mathematics (Feb 2025)
EHAFF-NET: Enhanced Hybrid Attention and Feature Fusion for Pedestrian ReID
Abstract
This study addresses the cross-scenario challenges in pedestrian re-identification for public safety, including perspective differences, lighting variations, occlusions, and ambiguous feature representations. We propose a pedestrian re-identification method, EHAFF-NET, which integrates an enhanced hybrid attention mechanism with multi-branch feature fusion. We introduce the Enhanced Hybrid Attention Module (EHAM), which combines channel and spatial attention mechanisms. The channel attention mechanism uses self-attention to capture long-range dependencies and extracts multi-scale local features with convolutional kernels and channel shuffling. The spatial attention mechanism aggregates features using global average and max pooling to enhance spatial representation. To tackle perspective differences, lighting changes, and occlusions, we incorporate the Multi-Branch Feature Integration module. The global branch captures overall information with global average pooling, while the local branch integrates features from different layers via the Diverse-Depth Feature Integration Module (DDFIM) to extract multi-scale semantic information. The local branch also extracts features based on human body proportions, balancing high-level semantics and low-level details. Experiments show that our model achieves an mAP of 92.5% and R1 of 94.7% on the Market-1501 dataset, an mAP of 85.4% and R1 of 88.6% on the DukeMTMC-reID dataset, and an mAP of 49.1% and R1 of 73.8% on the MSMT17 dataset, demonstrating a clear accuracy advantage over several advanced models.
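To make the attention design described above concrete, the PyTorch snippet below gives a minimal, hypothetical sketch of a hybrid channel/spatial attention block in the spirit of EHAM: the channel branch mixes a globally pooled channel descriptor with two 1D convolutions of different kernel sizes and applies channel shuffling, and the spatial branch aggregates global average and max pooled maps. It omits the self-attention component of the paper's channel branch, and all kernel sizes, the group count, and module names are illustrative assumptions rather than the authors' implementation.

```python
# Loose sketch of a hybrid channel/spatial attention block (not the paper's code).
# Kernel sizes, the group count, and the class name are illustrative assumptions.

import torch
import torch.nn as nn


def channel_shuffle(x, groups):
    """Interleave channels across groups (as in ShuffleNet)."""
    b, c, h, w = x.size()
    x = x.view(b, groups, c // groups, h, w)
    x = x.transpose(1, 2).contiguous()
    return x.view(b, c, h, w)


class HybridAttention(nn.Module):
    """Channel attention (multi-scale 1D convs on a pooled descriptor, then
    channel shuffling) followed by spatial attention (global average and max
    pooling over the channel dimension)."""

    def __init__(self, channels, groups=4):
        super().__init__()
        # Channel branch: squeeze spatially, then mix channels at two scales.
        self.gap = nn.AdaptiveAvgPool2d(1)
        self.conv3 = nn.Conv1d(1, 1, kernel_size=3, padding=1, bias=False)
        self.conv5 = nn.Conv1d(1, 1, kernel_size=5, padding=2, bias=False)
        self.groups = groups
        # Spatial branch: 2 -> 1 attention map from avg/max descriptors.
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        b, c, _, _ = x.size()
        # --- channel attention ---
        y = self.gap(x).view(b, 1, c)               # (B, 1, C) channel descriptor
        y = self.conv3(y) + self.conv5(y)           # multi-scale channel mixing
        w = self.sigmoid(y).view(b, c, 1, 1)
        x = channel_shuffle(x * w, self.groups)     # re-weight, then shuffle groups
        # --- spatial attention ---
        avg = torch.mean(x, dim=1, keepdim=True)    # (B, 1, H, W)
        mx, _ = torch.max(x, dim=1, keepdim=True)   # (B, 1, H, W)
        s = self.sigmoid(self.spatial(torch.cat([avg, mx], dim=1)))
        return x * s


if __name__ == "__main__":
    feat = torch.randn(2, 256, 24, 12)              # dummy backbone feature map
    out = HybridAttention(256)(feat)
    print(out.shape)                                # torch.Size([2, 256, 24, 12])
```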
Keywords