Optimizing Prompts Using In-Context Few-Shot Learning for Text-to-Image Generative Models

Seunghun Lee; Jihoon Lee; Chan Ho Bae; Myung-Seok Choi; Ryong Lee; Sangtae Ahn

doi:10.1109/ACCESS.2023.3348778

IEEE Access (Jan 2024)

Affiliations

Seunghun Lee: School of Electronic and Electrical Engineering, Kyungpook National University, Daegu, South Korea
Jihoon Lee: School of Electronic and Electrical Engineering, Kyungpook National University, Daegu, South Korea
Chan Ho Bae: School of Electronic and Electrical Engineering, Kyungpook National University, Daegu, South Korea
Myung-Seok Choi: AI Data Research Center, Korea Institute of Science and Technology Information (KISTI), Daejeon, South Korea
Ryong Lee: AI Data Research Center, Korea Institute of Science and Technology Information (KISTI), Daejeon, South Korea
Sangtae Ahn: School of Electronic and Electrical Engineering, Kyungpook National University, Daegu, South Korea

DOI: https://doi.org/10.1109/ACCESS.2023.3348778
Journal volume & issue: Vol. 12, pp. 2660 – 2673

Abstract

Recently, various text-to-image generative models have been released, demonstrating their ability to generate high-quality synthesized images from text prompts. Despite these advancements, determining the appropriate text prompts to obtain desired images remains challenging. The quality of the synthesized images depends heavily on the user input, making it difficult to achieve consistent and satisfactory results. This limitation creates a need for an effective method that automatically generates optimized text prompts for text-to-image generative models. Thus, this study proposes a prompt optimization method that uses in-context few-shot learning in a pretrained language model. The proposed approach generates optimized text prompts to guide the image synthesis process by leveraging the contextual information available in a few text examples. Images synthesized with the proposed prompt optimization method scored 18% higher on average on an evaluation metric that measures the similarity between the generated images and the prompts used to generate them. The significance of this research lies in its potential to provide a more efficient and automated approach to obtaining high-quality synthesized images. The findings indicate that prompt optimization may offer a promising pathway for text-to-image generative models.
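
The in-context few-shot idea described in the abstract can be illustrated with a minimal sketch. It only shows how a handful of example pairs might be assembled into a single input for a pretrained language model, which would then be asked to complete the final "Output:" line. The example pairs, the instruction wording, and the name build_few_shot_prompt are assumptions made for illustration; they are not taken from the paper, and no particular language model or image generator is implied.

```python
from typing import List, Tuple

# Hypothetical (plain prompt, detailed prompt) pairs; NOT taken from the paper.
EXAMPLES: List[Tuple[str, str]] = [
    ("a cat on a sofa",
     "a fluffy cat resting on a velvet sofa, soft window light, detailed fur"),
    ("a mountain lake",
     "a calm mountain lake at sunrise, mist over the water, wide-angle view"),
]


def build_few_shot_prompt(user_prompt: str,
                          examples: List[Tuple[str, str]] = EXAMPLES) -> str:
    """Assemble few-shot context that a language model could complete.

    The instruction line and the Input/Output layout are illustrative
    assumptions, not the format used by the authors.
    """
    lines = ["Rewrite each input as a detailed text-to-image prompt.", ""]
    for plain, optimized in examples:
        lines.append(f"Input: {plain}")
        lines.append(f"Output: {optimized}")
        lines.append("")
    # The final pair is left open so the model supplies the optimized prompt.
    lines.append(f"Input: {user_prompt.strip()}")
    lines.append("Output:")
    return "\n".join(lines)


few_shot_text = build_few_shot_prompt("a dog in a park")
print(few_shot_text)
```

The text returned here would be passed to a language model of the reader's choice, and the model's completion would serve as the candidate optimized prompt for the image generator.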

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
