IET Computer Vision (Dec 2023)

Attribute-guided transformer for robust person re-identification

  • Zhe Wang,
  • Jun Wang,
  • Junliang Xing

DOI: https://doi.org/10.1049/cvi2.12215
Journal volume & issue: Vol. 17, no. 8, pp. 977–992

Abstract

Recent studies reveal the crucial role of local features in learning robust and discriminative representations for person re-identification (Re-ID). Existing approaches typically rely on external tasks, for example, semantic segmentation or pose estimation, to locate identifiable parts of given images. However, they heuristically utilise the predictions of off-the-shelf models, which may be sub-optimal in terms of both local partition quality and computational efficiency. They also ignore the mutual information among different inputs, which weakens the representation capability of local features. In this study, the authors put forward a novel Attribute-guided Transformer (AiT), which explicitly exploits pedestrian attributes as semantic priors for discriminative representation learning. Specifically, the authors first introduce an attribute learning process that generates a set of attention maps highlighting the informative parts of pedestrian images. Then, the authors design a Feature Diffusion Module (FDM) to iteratively inject attribute information into global feature maps, aiming at suppressing unnecessary noise and inferring attribute-aware representations. Finally, the authors propose a Feature Aggregation Module (FAM) to exploit mutual information for aggregating attribute characteristics from different images, enhancing the representation capability of the feature embedding. Extensive experiments demonstrate the superiority of AiT in learning robust and discriminative representations. As a result, the authors achieve performance competitive with state-of-the-art methods on several challenging benchmarks without any bells and whistles.
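To make the attribute-guidance idea concrete, the following is a minimal PyTorch sketch of the attribute-attention and feature-diffusion steps as described in the abstract. All module names, tensor shapes, the number of attributes, and the residual update rule are illustrative assumptions, not the paper's implementation; the actual AiT architecture (including its transformer backbone and the FAM) is not reproduced here.

    # Illustrative sketch only: module names, shapes, and the update rule
    # are assumptions; the paper's actual AiT architecture may differ.
    import torch
    import torch.nn as nn


    class AttributeAttention(nn.Module):
        """Predicts one spatial attention map per pedestrian attribute."""

        def __init__(self, channels: int, num_attributes: int):
            super().__init__()
            self.conv = nn.Conv2d(channels, num_attributes, kernel_size=1)

        def forward(self, feats: torch.Tensor) -> torch.Tensor:
            # feats: (B, C, H, W) -> per-attribute maps (B, A, H, W) in [0, 1]
            return torch.sigmoid(self.conv(feats))


    class FeatureDiffusion(nn.Module):
        """Iteratively injects attribute-attended context into global features."""

        def __init__(self, channels: int, num_iters: int = 2):
            super().__init__()
            self.num_iters = num_iters
            self.fuse = nn.Conv2d(channels, channels, kernel_size=1)

        def forward(self, feats: torch.Tensor, attn: torch.Tensor) -> torch.Tensor:
            # Collapse the per-attribute maps into one saliency map (B, 1, H, W).
            saliency = attn.mean(dim=1, keepdim=True)
            for _ in range(self.num_iters):
                # Residual injection: emphasise attribute-salient regions,
                # leaving the rest of the feature map intact.
                feats = feats + self.fuse(feats * saliency)
            return feats


    if __name__ == "__main__":
        feats = torch.randn(4, 256, 24, 8)        # hypothetical backbone features
        attn_head = AttributeAttention(256, num_attributes=12)
        fdm = FeatureDiffusion(256)
        refined = fdm(feats, attn_head(feats))
        print(refined.shape)                      # torch.Size([4, 256, 24, 8])

In this sketch the attention maps act as soft spatial gates, so the diffusion step amplifies attribute-bearing regions while suppressing background noise, which is the role the abstract assigns to the FDM.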

Keywords