Explicit Image Caption Reasoning: Generating Accurate and Informative Captions for Complex Scenes with LMM

Mingzhang Cui; Caihong Li; Yi Yang

doi:10.3390/s24123820

Sensors (Jun 2024)

Affiliations

Mingzhang Cui: School of Information Science and Engineering, Lanzhou University, Lanzhou 730000, China
Caihong Li: Key Laboratory of Artificial Intelligence and Computing Power Technology, Lanzhou 730000, China
Yi Yang: School of Information Science and Engineering, Lanzhou University, Lanzhou 730000, China

Journal volume & issue: Vol. 24, no. 12, p. 3820

Abstract

Rapid progress in sensor technologies and deep learning has significantly advanced image captioning, especially for complex scenes. Traditional image captioning methods often fail to capture the intricate details and object relationships within such scenes. To overcome these limitations, this paper introduces Explicit Image Caption Reasoning (ECR), a novel approach that generates accurate and informative captions for complex scenes captured by advanced sensors. ECR employs an enhanced inference chain to analyze sensor-derived images, examining object relationships and interactions to achieve deeper semantic understanding. We implement ECR using the optimized ICICD dataset, a subset of the sensor-oriented Flickr30K-EE dataset that contains comprehensive inference chain information. By leveraging rich sensor data, this dataset improves training efficiency and caption quality. We create the Explicit Image Caption Reasoning Multimodal Model (ECRMM) by fine-tuning TinyLLaVA on the ICICD dataset. Experiments demonstrate the effectiveness and robustness of ECR in processing sensor data, where it outperforms traditional methods.

Published in Sensors

ISSN: 1424-8220 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Chemical technology
Website: http://www.mdpi.com/journal/sensors
