IEEE Access (Jan 2023)

Designing Practical End-to-End System for Soft Biometric-Based Person Retrieval From Surveillance Videos

  • Jay N. Chaudhari,
  • Hiren Galiyawala,
  • Minoru Kuribayashi,
  • Paawan Sharma,
  • Mehul S. Raval

DOI
https://doi.org/10.1109/ACCESS.2023.3337108
Journal volume & issue
Vol. 11
pp. 133640 – 133657

Abstract

Video surveillance improves public safety by preventing and detecting criminal activity, enabling quick countermeasures, and providing evidence to investigators. A person can be retrieved from a video effectively by issuing a natural language query that contains soft biometrics. State-of-the-art (SOTA) approaches focus on improving retrieval results; the building blocks of a person retrieval system therefore receive little attention, which puts novice researchers at a disadvantage. This study provides a design methodology by showcasing the block-by-block construction of a person retrieval system that uses video and natural language. For each subsystem (natural language processing, person detection, attribute recognition, and ranking), we discuss the available design choices, provide empirical evidence, and discuss bottlenecks and solutions. We then select and integrate the best choices to create an end-to-end system. We highlight the integration challenges and demonstrate that the proposed method achieves an average intersection over union and a true positive rate of $\geq 60\%$. This is the first study to provide practical guidance to researchers for fast prototyping of person retrieval with subsystem-level understanding while achieving SOTA performance.
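
The two metrics reported in the abstract, average intersection over union (IoU) and true positive rate, can be illustrated with a minimal sketch. This is not the authors' evaluation code. It assumes one predicted and one ground-truth bounding box per frame in `(x_min, y_min, x_max, y_max)` form, and it assumes that a frame counts as a true positive when its IoU reaches a threshold. The function names and the default threshold of 0.4 are hypothetical choices made only for this illustration.

```python
from typing import Sequence, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Width and height of the overlap rectangle, clamped at zero
    # so that disjoint boxes produce an empty intersection.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def evaluate(
    predicted: Sequence[Box],
    ground_truth: Sequence[Box],
    iou_threshold: float = 0.4,  # hypothetical threshold, an assumption
) -> Tuple[float, float]:
    """Return (average IoU, true positive rate) over paired frames."""
    scores = [iou(p, g) for p, g in zip(predicted, ground_truth)]
    if not scores:
        return 0.0, 0.0
    average_iou = sum(scores) / len(scores)
    # A frame is treated as a true positive when its IoU meets the threshold.
    true_positive_rate = sum(s >= iou_threshold for s in scores) / len(scores)
    return average_iou, true_positive_rate
```

Calling `evaluate(predicted_boxes, ground_truth_boxes)` on the per-frame boxes of a retrieved person returns both values as fractions in the range 0 to 1, which can be compared against the 60% level quoted in the abstract.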

Keywords