Efficient Deep Learning Approach to Recognize Person Attributes by Using Hybrid Transformers for Surveillance Scenarios

S. Raghavendra; Ramyashree; S. K. Abhilash; Venu Madhav Nookala; S. Kaliraj

doi:10.1109/ACCESS.2023.3241334

IEEE Access (Jan 2023)

Affiliations

S. Raghavendra: Department of Information and Communication Technology, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, India
Ramyashree: Department of Information and Communication Technology, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, India
S. K. Abhilash: PathPartner Technology Private Ltd., Bengaluru, India
Venu Madhav Nookala: PathPartner Technology Private Ltd., Bengaluru, India
S. Kaliraj: Department of Information and Communication Technology, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, India

DOI: https://doi.org/10.1109/ACCESS.2023.3241334
Journal volume & issue: Vol. 11, pp. 10881–10893

Abstract

Pedestrian attribute recognition underlies numerous deep perception technologies and methods, with applications in fields including autonomous driving, surveillance, and object tracking. Recognizing person attributes is a recent research area that has attracted much interest in video surveillance. Locating the regions of a person that correspond to each attribute is difficult and plays a significant role in recognition. This paper presents a method, applied to basic convolutional neural networks, for locating the region associated with each person attribute. Person attributes such as gender, age, clothing style, and accessories have received much attention in video surveillance analytics. This article adopts a Conv-Attentional image Transformer that decomposes the most discriminative attributes and regions into multiple levels. Serial blocks consist of conv-attention and a feed-forward network, and parallel blocks use two attention strategies: direct cross-layer attention and feature interpolation. The method also provides a flexible Attribute Localization Module (ALM) that learns the regional features of each attribute at several levels and adaptively selects the most discriminative regions. We conclude that hybrid transformers outperform pure transformers in this setting. Extensive experimental results indicate that the proposed hybrid technique achieves higher results than current methods on four pedestrian attribute datasets: RapV2, RapV1, PETA, and PA100K.

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
