Semantic-Guided Selective Representation for Image Captioning

Yinan Li; Yiwei Ma; Yiyi Zhou; Xiao Yu

doi:10.1109/ACCESS.2023.3243952

IEEE Access (Jan 2023)

Affiliations

Yinan Li: Media Analytics and Computing Laboratory, Department of Artificial Intelligence, School of Informatics, Xiamen University, Xiamen, China
Yiwei Ma: Media Analytics and Computing Laboratory, Department of Artificial Intelligence, School of Informatics, Xiamen University, Xiamen, China
Yiyi Zhou: Media Analytics and Computing Laboratory, Department of Artificial Intelligence, School of Informatics, Xiamen University, Xiamen, China
Xiao Yu: Digital Governance Laboratory, Sichuan Administration Institute, Chengdu, China

DOI: https://doi.org/10.1109/ACCESS.2023.3243952
Journal volume & issue: Vol. 11
pp. 14500 – 14510

Abstract

Grid-based features have been shown to be as effective as region-based features in multi-modal tasks such as visual question answering. However, their application to image captioning encounters two main issues: noisy features and fragmented semantics. In this paper, we propose a novel feature selection scheme that combines Relation-Aware Selection (RAS) with a Fine-grained Semantic Guidance (FSG) learning strategy. Based on grid-wise interactions, RAS enhances the salient visual regions and channels and suppresses the less important ones. This selection process is guided by FSG, which supervises it with fine-grained semantic knowledge. Experimental results on MS COCO show that the proposed RAS-FSG scheme achieves state-of-the-art performance in both off-line and on-line testing, with 134.3 CIDEr off-line and 135.4 CIDEr on-line. Extensive ablation studies and visualizations also validate the effectiveness of our scheme.
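
The abstract describes selection over both regions and channels, driven by grid-wise interactions. The sketch below is only an illustration of that general idea and is not the authors' RAS: it assumes plain scaled dot-product attention between grid cells followed by parameter-free sigmoid gates, and the function name `select_features` is hypothetical. The FSG supervision, which applies during training, is not modelled here.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(row):
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def select_features(grid):
    """Gate a list of N grid vectors (each with C channels).

    Assumption: gates are derived without learned weights, purely to
    show the data flow; a real model would learn these projections.
    """
    n, c = len(grid), len(grid[0])
    scale = math.sqrt(c)
    # Grid-wise interactions: every grid cell attends to all cells.
    attn = [
        softmax([sum(a * b for a, b in zip(gi, gj)) / scale for gj in grid])
        for gi in grid
    ]
    context = [
        [sum(attn[i][j] * grid[j][k] for j in range(n)) for k in range(c)]
        for i in range(n)
    ]
    # Spatial gate: one weight in (0, 1) per grid cell.
    spatial = [sigmoid(sum(ctx) / c) for ctx in context]
    # Channel gate: one weight in (0, 1) per channel.
    channel = [sigmoid(sum(context[i][k] for i in range(n)) / n) for k in range(c)]
    # Rescale each feature by its cell weight and its channel weight.
    return [[grid[i][k] * spatial[i] * channel[k] for k in range(c)] for i in range(n)]
```

Because both gates lie strictly between 0 and 1, every feature is scaled down, and cells or channels with weaker aggregated context are scaled down more, which is the enhance-and-suppress behaviour the abstract refers to in its simplest form.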

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639