Visualizing Ambiguity: Analyzing Linguistic Ambiguity Resolution in Text-to-Image Models

Wala Elsharif; Mahmood Alzubaidi; James She; Marco Agus

doi:10.3390/computers14010019

Computers (Jan 2025)

Affiliations

Wala Elsharif: College of Science and Engineering, Hamad Bin Khalifa University, Doha 34110, Qatar
Mahmood Alzubaidi: College of Science and Engineering, Hamad Bin Khalifa University, Doha 34110, Qatar
James She: Department of Electronic and Computer Engineering, Hong Kong University of Science and Technology, Hong Kong 999077, China
Marco Agus: College of Science and Engineering, Hamad Bin Khalifa University, Doha 34110, Qatar

DOI: https://doi.org/10.3390/computers14010019
Journal volume & issue: Vol. 14, no. 1, p. 19

Abstract

Text-to-image models have made remarkable progress in generating visual content from textual descriptions. However, linguistic ambiguity in text prompts poses a challenge to these models and can lead to undesired or inaccurate outputs. This preliminary study uses a series of experiments to examine how text-to-image diffusion models resolve linguistic ambiguity. We test prompts exhibiting different types of linguistic ambiguity on different models and analyze the images they generate, focusing on how the models’ interpretations of the ambiguity compare to those of humans. In addition, we present the Visual Linguistic Ambiguity Benchmark (V-LAB), a curated dataset of ambiguous prompts and their corresponding images. Furthermore, we report a number of limitations and failure modes caused by linguistic ambiguity in text-to-image models and propose prompt engineering guidelines to minimize the impact of ambiguity. The findings of this exploratory study contribute to the ongoing improvement of text-to-image models and provide insights for future work in the field.

Published in Computers

ISSN: 2073-431X (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: http://www.mdpi.com/journal/computers
