Pose-Guided Spatial Alignment and Key Frame Selection for One-Shot Video-Based Person Re-Identification

Yuzhong Chen; Tengda Huang; Yuzhen Niu; Xiao Ke; Yangyang Lin

doi:10.1109/access.2019.2922679

IEEE Access (Jan 2019)

Affiliations

Yuzhong Chen: Fujian Key Laboratory of Network Computing and Intelligent Information Processing, College of Mathematics and Computer Science, Fuzhou University, Fuzhou, China
Tengda Huang: Fujian Key Laboratory of Network Computing and Intelligent Information Processing, College of Mathematics and Computer Science, Fuzhou University, Fuzhou, China
Yuzhen Niu: Fujian Key Laboratory of Network Computing and Intelligent Information Processing, College of Mathematics and Computer Science, Fuzhou University, Fuzhou, China
Xiao Ke: Fujian Key Laboratory of Network Computing and Intelligent Information Processing, College of Mathematics and Computer Science, Fuzhou University, Fuzhou, China
Yangyang Lin: Fujian Key Laboratory of Network Computing and Intelligent Information Processing, College of Mathematics and Computer Science, Fuzhou University, Fuzhou, China

Journal volume & issue: Vol. 7, pp. 78991–79004

Abstract

One-shot video-based person re-identification trains a model from a single labeled sample per individual and exploits the remaining unlabeled data, reducing the need for laborious labeling. Although recent work on this task has made progress, most state-of-the-art models remain vulnerable to misalignment, pose variation, and corrupted frames. To address these challenges, we propose a one-shot video-based person re-identification model based on pose-guided spatial alignment and key frame selection (KFS). First, a spatial transformer sub-network trained using pose-guided regression performs the spatial alignment. Second, we propose a novel training strategy based on KFS: key frames with abruptly changing poses are deliberately identified and selected to make the network adaptive to pose variation. Finally, we propose a frame feature pooling method that combines long short-term memory (LSTM) with an attention mechanism to reduce the influence of corrupted frames. Comprehensive experiments are conducted on the MARS and DukeMTMC-VideoReID datasets. The mean average precision (mAP) on these datasets reaches 46.5% and 68.4%, respectively, a significant improvement over state-of-the-art one-shot person re-identification methods.
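
The attention-based frame pooling mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract attributes the frame scoring to an LSTM with an attention mechanism, which is not reproduced here, so the scores are supplied by hand. The names `softmax`, `attention_pool`, `frame_features`, and `frame_scores` are hypothetical. The sketch only shows how softmax-normalized scores let a low-scored (for example, corrupted) frame contribute little to the pooled sequence feature.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(frame_features, frame_scores):
    """Return the attention-weighted average of per-frame feature vectors.

    frame_features: list of T equal-length feature vectors (lists of floats).
    frame_scores:   list of T unnormalized attention scores. These are an
                    assumption of this sketch; a learned scoring network
                    would normally produce them.
    """
    if not frame_features or len(frame_features) != len(frame_scores):
        raise ValueError("need at least one frame and one score per frame")
    weights = softmax(frame_scores)
    dim = len(frame_features[0])
    pooled = [0.0] * dim
    for w, feat in zip(weights, frame_features):
        for i in range(dim):
            pooled[i] += w * feat[i]
    return pooled, weights

# Toy input: three consistent frames and one outlier frame with a low score.
features = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
scores = [2.0, 2.0, 2.0, -2.0]
pooled, weights = attention_pool(features, scores)
```

With these toy values the outlier frame receives well under 1% of the total weight, so the pooled vector stays close to the three consistent frames. Plain average pooling would instead give the outlier a weight of 25%.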

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
