Attention-Guided Hierarchical Parsing for Fine-Grained Person-Centric Image Captioning

Zhengcheng Gu; Jing Jin

doi:10.1109/ACCESS.2024.3416207

IEEE Access (Jan 2024)

Affiliations

Zhengcheng Gu: School of Computer and Information Engineering (Artificial Intelligence), Nanjing Tech University, Nanjing, Jiangsu, China
Jing Jin: School of Computer and Information Engineering (Artificial Intelligence), Nanjing Tech University, Nanjing, Jiangsu, China

DOI: https://doi.org/10.1109/ACCESS.2024.3416207
Journal volume & issue: Vol. 12
pp. 86293 – 86301

Abstract

Although current models have made significant progress in generating fine-grained captions for portrait images, they still struggle with attention allocation and with capturing the detailed characteristics of their subjects. As a result, they have difficulty generating accurate, refined captions for person-centric images. To address this issue, we propose a model named Attention-guided Hierarchical Parsing (AHP). The model leverages the strong segmentation performance of the Segment Anything Model (SAM) to prioritize key information in person-centric images, so that it stays focused on the subject even in complex scenes. In addition, the model uses a multi-level image feature encoding-decoding framework, which significantly enhances its capacity to capture intricate image details by analyzing multi-scale features within images. Extensive experimental results demonstrate the superior performance of the proposed model in generating fine-grained, high-quality captions, improving the quality of image caption generation and offering new perspectives and methods for fine-grained image caption generation.
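
The abstract does not say how the segmentation output enters the attention computation. As an illustration of the general idea only, the sketch below shows one common way a subject mask can steer attention: an additive bias on the scores of image patches that fall inside the mask, applied before the softmax. It is not the AHP architecture. The function names, the `bias` parameter, and the per-patch binary mask (standing in for a mask such as SAM might produce) are all assumptions made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mask_guided_attention(scores, subject_mask, bias=2.0):
    """Hypothetical helper: bias attention toward patches inside a subject mask.

    scores       -- raw attention scores, one per image patch
    subject_mask -- 1 for patches inside the subject mask, 0 otherwise
                    (assumed to come from a segmentation model)
    bias         -- assumed additive boost for in-mask patches
    """
    if len(scores) != len(subject_mask):
        raise ValueError("scores and subject_mask must have the same length")
    biased = [s + bias * m for s, m in zip(scores, subject_mask)]
    return softmax(biased)


def attend(features, weights):
    """Weighted sum of per-patch feature vectors (lists of floats)."""
    dim = len(features[0])
    return [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]


# Four patches with equal raw scores; the first two lie on the subject.
weights = mask_guided_attention([0.0, 0.0, 0.0, 0.0], [1, 1, 0, 0])
context = attend([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]], weights)
```

With equal raw scores, the in-mask patches receive larger weights than the background patches, so the pooled `context` vector is dominated by subject features. Setting `bias=0.0` recovers ordinary softmax attention.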

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
