Image Stream From Paragraph Method Based on Scene Graph

ZHANG Wei-qi, TANG Yi-feng, LI Lin-yan, HU Fu-yuan

doi:10.11896/jsjkx.201100207

Jisuanji kexue (Jan 2022)

Affiliations

ZHANG Wei-qi, TANG Yi-feng, LI Lin-yan, HU Fu-yuan:
1 School of Electronic Information Engineering, Suzhou University of Science and Technology, Suzhou, Jiangsu 215009, China
2 Suzhou Key Laboratory for Big Data and Information Service, Suzhou University of Science and Technology, Suzhou, Jiangsu 215009, China
3 Suzhou Institute of Trade and Commerce, Suzhou, Jiangsu 215009, China
4 Virtual Reality Key Laboratory of Intelligent Interaction and Application Technology of Suzhou, Suzhou University of Science and Technology, Suzhou, Jiangsu 215009, China

Journal volume & issue: Vol. 49, no. 1
pp. 233 – 240

Abstract

Generative adversarial networks (GANs) can already generate fairly high-quality image sequences from paragraphs. However, when the input text involves multiple objects and relationships, the context information of the text sequence is difficult to extract, the object layout of the generated images is prone to confusion, and the generated objects lack detail. To address these problems, this paper builds on StoryGAN and proposes a scene-graph-based method for generating image sequences. First, the paragraph is converted into multiple scene graphs through graph convolution; each scene graph contains the object and relationship information of the corresponding text. Then, the bounding box and segmentation mask of each object are predicted in order to compute the scene layout. Finally, a sequence of images that better matches the objects and their relationships is generated from the scene layout and the context information. Tests on the CLEVR-SV and CoDraw-SV data sets show that the proposed method can generate 64×64-pixel image sequences containing multiple objects and their relationships. On the CLEVR-SV data set, the method improves on StoryGAN's SSIM and FID by 1.34% and 9.49%, respectively. On the CoDraw-SV data set, its ACC is 7.40% higher than that of StoryGAN. The proposed method produces more plausible scene layouts: it can generate image sequences containing multiple objects and relationships, and the generated images are of higher quality with clearer details.
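To make the layout step described in the abstract more concrete, the following is a minimal sketch of how a scene graph and per-object bounding boxes might be turned into a 64×64 scene layout. It is an illustration only, not the authors' implementation: the `SceneGraph` type, the `rasterize_layout` function, and the hand-picked box coordinates are all assumptions, and the sketch uses boxes alone, whereas the paper also predicts segmentation masks and uses a learned network for both.

```python
from dataclasses import dataclass
from typing import List, Tuple


# Hypothetical container for one scene graph; not taken from the paper.
@dataclass
class SceneGraph:
    objects: List[str]                    # e.g. ["cube", "sphere"]
    triples: List[Tuple[int, str, int]]   # (subject index, predicate, object index)


# A box is (x0, y0, x1, y1) with coordinates normalised to [0, 1].
Box = Tuple[float, float, float, float]


def rasterize_layout(boxes: List[Box], size: int = 64) -> List[List[int]]:
    """Paint each object's box onto a size x size grid.

    Cell value 0 means background; value i + 1 means object i.
    Where boxes overlap, later objects overwrite earlier ones.
    """
    grid = [[0] * size for _ in range(size)]
    for idx, (x0, y0, x1, y1) in enumerate(boxes):
        c0, c1 = int(x0 * size), int(x1 * size)
        r0, r1 = int(y0 * size), int(y1 * size)
        for r in range(max(r0, 0), min(r1, size)):
            for c in range(max(c0, 0), min(c1, size)):
                grid[r][c] = idx + 1
    return grid


# One sentence of a paragraph, "a cube is left of a sphere", as a scene graph.
graph = SceneGraph(objects=["cube", "sphere"], triples=[(0, "left of", 1)])

# In the paper the boxes are predicted; here they are hand-picked (assumption).
boxes = [(0.0, 0.25, 0.5, 0.75), (0.5, 0.25, 1.0, 0.75)]
layout = rasterize_layout(boxes)
```

In a pipeline of the kind the abstract describes, one such layout would be computed per scene graph and passed, together with the paragraph's context information, to the image generator.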

Keywords: generative adversarial networks, graph convolutional network, scene layout, text-to-image synthesis

Published in Jisuanji kexue

ISSN: 1002-137X (Print)
Publisher: Editorial office of Computer Science
Country of publisher: China
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science: Computer software; Technology: Technology (General)
Website: http://www.jsjkx.com/CN/1002-137X/home.shtml
