Jointly human semantic parsing and attribute recognition with feature pyramid structure in EfficientNets

Mahnaz Moghaddam; Mostafa Charmi; Hossein Hassanpoor

doi:10.1049/ipr2.12195

IET Image Processing (Aug 2021)

Affiliations

Mahnaz Moghaddam: Department of Electrical Engineering, Faculty of Engineering, University of Zanjan, Zanjan, Iran
Mostafa Charmi: Department of Electrical Engineering, Faculty of Engineering, University of Zanjan, Zanjan, Iran
Hossein Hassanpoor: Department of Computational Neuroscience, Dade Pardazi Shenakht Mehvare Atynegar, Tehran, Iran

DOI: https://doi.org/10.1049/ipr2.12195
Journal volume & issue: Vol. 15, no. 10
pp. 2281 – 2291

Abstract

Pedestrian attribute recognition is an important problem in computer vision and plays a special role in video surveillance. Previous methods are mainly based on multi‐label, end‐to‐end deep neural networks. These methods do not use attributes to define local feature regions, and they suffer from the problems caused by the presence of the bounding box. This paper proposes a new framework that jointly performs human semantic parsing and pedestrian attribute recognition to achieve effective attribute recognition. By extracting human parts via semantic parsing, both semantic and spatial information can be exploited while the background is eliminated. The framework also uses multi‐scale features, capturing rich details and contextual information through the proposed attribute recognition‐bidirectional feature pyramid network. Because the baseline network has a significant impact on performance, EfficientNet‐B3 is selected from the EfficientNet family as the baseline; it provides an appropriate trade‐off among the three CNN scaling factors (depth, width, and resolution). Finally, the proposed framework is evaluated on the PETA, RAP, and PA‐100k datasets. Experimental results show that the method achieves superior performance in both mean accuracy and instance‐based metrics compared with state‐of‐the‐art results.
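
The abstract reports mean accuracy and instance‐based metrics without defining them. As a rough illustration only, the sketch below assumes the definitions conventionally used in pedestrian attribute recognition benchmarks: mean accuracy (mA) as the per‐attribute average of true‐positive and true‐negative rates, and instance‐based accuracy, precision, recall, and F1 computed per image and then averaged. The function name `attribute_metrics` and the toy labels are hypothetical and do not come from the paper.

```python
from typing import Dict, Sequence


def attribute_metrics(
    y_true: Sequence[Sequence[int]],
    y_pred: Sequence[Sequence[int]],
) -> Dict[str, float]:
    """Compute mA and instance-based metrics for binary attribute labels.

    Assumption: conventional benchmark definitions, not taken from the paper.
    Rows are images, columns are attributes, values are 0 or 1.
    """
    n = len(y_true)       # number of images
    m = len(y_true[0])    # number of attributes

    # Label-based mean accuracy: average of TPR and TNR for each attribute.
    per_attribute = []
    for j in range(m):
        pos = sum(1 for i in range(n) if y_true[i][j] == 1)
        neg = n - pos
        tp = sum(1 for i in range(n) if y_true[i][j] == 1 and y_pred[i][j] == 1)
        tn = sum(1 for i in range(n) if y_true[i][j] == 0 and y_pred[i][j] == 0)
        tpr = tp / pos if pos else 0.0
        tnr = tn / neg if neg else 0.0
        per_attribute.append((tpr + tnr) / 2)
    mean_accuracy = sum(per_attribute) / m

    # Instance-based metrics: computed per image, then averaged over images.
    acc = prec = rec = 0.0
    for truth, pred in zip(y_true, y_pred):
        inter = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 1)
        union = sum(1 for t, p in zip(truth, pred) if t == 1 or p == 1)
        n_true = sum(truth)
        n_pred = sum(pred)
        acc += inter / union if union else 0.0
        prec += inter / n_pred if n_pred else 0.0
        rec += inter / n_true if n_true else 0.0
    acc, prec, rec = acc / n, prec / n, rec / n
    f1 = 2 * prec * rec / (prec + rec) if (prec + rec) else 0.0

    return {"mA": mean_accuracy, "accuracy": acc,
            "precision": prec, "recall": rec, "f1": f1}


# Toy example: two images, three attributes (hypothetical labels).
metrics = attribute_metrics(
    y_true=[[1, 0, 1], [0, 1, 0]],
    y_pred=[[1, 0, 0], [0, 1, 1]],
)
print(metrics)
```

Because mA averages the positive and negative rates per attribute, it is less dominated by frequent attributes than plain accuracy, which may be why it is commonly reported alongside the instance‐based scores.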

Published in IET Image Processing

ISSN: 1751-9659 (Print); 1751-9667 (Online)
Publisher: Wiley
Country of publisher: United Kingdom
LCC subjects: Technology: Photography; Science: Mathematics: Instruments and machines: Electronic computers. Computer science: Computer software
Website: https://ietresearch.onlinelibrary.wiley.com/journal/17519667
