A Novel Scheme for Managing Multiple Context Transitions While Ensuring Consistency in Text-to-Image Generative Artificial Intelligence

Hyunjo Kim; Jae-Ho Choi; Jin-Young Choi

doi:10.1109/ACCESS.2024.3476933

IEEE Access (Jan 2024)

Affiliations

Hyunjo Kim: School of Cybersecurity, Korea University, Seoul, South Korea
Jae-Ho Choi: MCCAAI Company Ltd., Seoul, South Korea
Jin-Young Choi: School of Cybersecurity, Korea University, Seoul, South Korea

DOI: https://doi.org/10.1109/ACCESS.2024.3476933
Journal volume & issue: Vol. 12
pp. 150468 – 150484

Abstract

Humans have a remarkable ability to understand stories presented in text and to imagine related images. This ability aids comprehension and enhances enjoyment, so automated systems that generate visually faithful images from textual descriptions are a meaningful goal, and many artificial intelligence (AI) systems for text-to-image generation have been developed. In previous research, we studied how to generate images while maintaining the context of the input when multiple sentences are provided. In this paper, we argue that for longer and more structured inputs, such as novels, essays, or papers, it is essential not only to keep the context consistent but also to handle transitions between different contexts, a challenge that cannot be resolved merely by dividing sentences into paragraphs. We introduce the Structured Context Retention Methods (SCRM) scheme, which reflects the user’s intentions for both context retention and smooth transitions across varying narrative elements. Through experiments, we demonstrate that SCRM performs well in terms of ROUGE recall while handling a large number of input sentences and context transitions.
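
For readers unfamiliar with the metric named in the abstract, ROUGE-N recall is the fraction of a reference text's n-grams that also appear in a candidate text. The abstract does not say which ROUGE variant, tokenization, or reference/candidate pairing the authors used, so the following is only a minimal sketch of the standard definition. The function names, the whitespace tokenization, and the lowercasing are all assumptions made for illustration, not the paper's evaluation code.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(reference, candidate, n=1):
    """ROUGE-N recall: clipped n-gram overlap divided by reference n-gram count.

    Assumption: simple lowercase whitespace tokenization, no stemming.
    """
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Each reference n-gram is matched at most as often as it occurs in the candidate.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    candidate = "the cat lay on the mat"
    print(rouge_n_recall(reference, candidate, n=1))
    print(rouge_n_recall(reference, candidate, n=2))
```

In this toy pair, five of the six reference unigrams and three of the five reference bigrams are recovered by the candidate. Because recall is normalized by the reference length, it rewards a candidate for covering the reference content and does not penalize extra words.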

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
