AF-CPACNet: AnchorFree Crowd Parsing Attention-Based Characteristic Segmentation Network

S. Raghavendra; S. K. Abhilash; Venu Madhav Nookala; Yashwanth Nanjappa

doi:10.1109/ACCESS.2024.3501394

IEEE Access (Jan 2024)

Affiliations

S. Raghavendra: Department of Information and Communication Technology, Manipal Academy of Higher Education, Manipal Institute of Technology, Manipal, India
S. K. Abhilash: KPIT Technologies, Bengaluru, India
Venu Madhav Nookala: KPIT Technologies, Bengaluru, India
Yashwanth Nanjappa: Department of Electronics and Communication Engineering, Manipal Academy of Higher Education, Manipal Institute of Technology, Manipal, India

Journal volume & issue: Vol. 12, pp. 171706–171717

Abstract

Multi-human parsing is the task of segmenting and identifying individual human body parts in images that contain multiple people. It is a crucial task in computer vision, particularly for applications such as human pose estimation, scene understanding, and virtual reality. This paper explores the features and techniques used in multi-human parsing, including deep learning models such as convolutional neural networks (CNNs) and attention mechanisms, to accurately detect and segment human body parts in crowded or complex environments. Anchor boxes often fail to capture the wide variation in human body shapes and poses, leading to suboptimal performance in human parsing tasks. To address this limitation, we introduce AF-CPACNet, a novel model that eliminates the need for anchor boxes by adopting a multi-head, multi-task architecture. AF-CPACNet consists of two key components, a detection head and an edge-guided parsing module, which enable pixel-level analysis and improve the precision of human body part segmentation. Additionally, a refinement head is incorporated to further enhance semantic parsing quality. The model captures finer details of human body parts by considering color, size, and pattern attributes in a single forward pass while operating in real time. A specialized loss function is employed to optimize semantic parsing results and improve training efficiency. We evaluate AF-CPACNet on multiple human parsing datasets, including CCIHP and CIHP, and demonstrate that it significantly outperforms existing state-of-the-art methods. Specifically, AF-CPACNet achieves an 11% improvement on the CIHP dataset and an mIoU of 67.3 on the CCIHP dataset, across both global and instance-level metrics. The open-source code is available at https://github.com/abhigoku10/AF-CPACNet.git.
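
For readers unfamiliar with the mIoU figure quoted above, the following is a minimal sketch of the standard mean intersection-over-union computation for one predicted label map against one ground-truth map. It is not taken from the AF-CPACNet repository: the function name `mean_iou`, the nested-list input format, and the choice to skip classes absent from both maps are assumptions made for illustration, and the paper's exact evaluation protocol (for example, how counts are accumulated across a dataset) may differ.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes for two equally sized 2-D label maps.

    Hypothetical helper for illustration only; inputs are nested lists
    of integer class ids in the range [0, num_classes).
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                # Pixel agrees: counts once in both intersection and union.
                intersection[p] += 1
                union[p] += 1
            else:
                # Pixel disagrees: it belongs to the union of both classes.
                union[p] += 1
                union[t] += 1
    # Assumption: classes that appear in neither map are left out of the mean.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    prediction = [[0, 1], [1, 2]]
    ground_truth = [[0, 1], [2, 2]]
    print(mean_iou(prediction, ground_truth, num_classes=3))
```

In this toy case the per-class IoUs are 1, 1/2, and 1/2, so the mean is 2/3. Benchmark scores are usually reported as a percentage and computed from counts accumulated over all test images rather than from a single pair.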

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
