IEEE Access (Jan 2024)

A Video Target Re-Recognition Method Based on Adaptive Attention Enhancement and Multi-Scale Feature Fusion

  • Zhiming Xu,
  • Jinhuang Chen,
  • Zhaoqi Chen,
  • Jiajun Zou,
  • Mengbo Wang,
  • Zemin Qiu

DOI
https://doi.org/10.1109/ACCESS.2023.3339841
Journal volume & issue
Vol. 12
pp. 9392 – 9399

Abstract

Re-identifying targets across multiple video frames is a central problem in computer vision, with applications in video surveillance, smart transportation systems, and pedestrian flow analytics. Conventional re-identification techniques are limited by varying camera perspectives, inconsistent lighting conditions, and frequent occlusions. To address these challenges, this research introduces MVF-Re, a re-identification approach that combines adaptive attention mechanisms with multi-scale feature fusion. First, we design a deep attention-enhanced feature pyramid network that adapts dynamically to video frame content in order to capture fine-grained target details. Second, we incorporate a multi-input Siamese network to obtain consistent and robust features across diverse contexts. Third, to make the features more discriminative, we propose a context-sensitive dynamic attention mechanism that assigns weights to individual video frames. Finally, we apply a multi-scale feature fusion method that yields a comprehensive and robust target representation. Experiments on multiple benchmark datasets demonstrate the superior performance of the method in multi-frame target re-identification.
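The abstract does not give the formulas behind the frame weighting or the fusion step, so the following is only a minimal illustrative sketch of the general idea, not the MVF-Re implementation. It assumes that frame weights come from a softmax over dot-product scores against a query vector, and that fusion is a concatenation of L2-normalized per-scale descriptors. The names `attention_pool`, `fuse_scales`, and the query vectors are hypothetical and do not come from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frame_features, query):
    """Pool per-frame feature vectors into one vector.

    Assumption (not from the paper): each frame is scored by its dot
    product with a query vector, and the scores are turned into weights
    with a softmax. Returns the weighted sum and the weights.
    """
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in frame_features]
    weights = softmax(scores)
    dim = len(frame_features[0])
    pooled = [
        sum(w * feat[i] for w, feat in zip(weights, frame_features))
        for i in range(dim)
    ]
    return pooled, weights


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def fuse_scales(per_scale_frames, per_scale_queries):
    """Fuse several feature scales into one descriptor.

    Assumption (not from the paper): each scale is attention-pooled over
    frames, L2-normalized, and the results are concatenated.
    """
    fused = []
    for frames, query in zip(per_scale_frames, per_scale_queries):
        pooled, _ = attention_pool(frames, query)
        fused.extend(l2_normalize(pooled))
    return fused


# Toy usage: two frames, two scales (feature dimensions 2 and 3).
coarse = [[1.0, 0.0], [0.0, 1.0]]
fine = [[2.0, 2.0, 2.0], [2.0, 2.0, 2.0]]
descriptor = fuse_scales([coarse, fine], [[1.0, 0.0], [0.0, 0.0, 0.0]])
```

In this toy setup, a frame whose features align better with the query receives a larger weight, and the final descriptor has one block per scale. In a learned model the query and the per-frame features would come from trained networks rather than fixed lists.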

Keywords