Jisuanji kexue (Jan 2022)
Image Stream From Paragraph Method Based on Scene Graph
Abstract
Generative adversarial networks (GANs) can already generate relatively high-quality image sequences from paragraphs. However, when the input text involves multiple objects and relationships, the context information of the text sequence is difficult to extract, the object layout of the generated images is prone to confusion, and the generated object details are insufficient. To address these problems, this paper proposes a scene-graph-based method for generating image sequences, built on StoryGAN. First, the paragraph is converted into multiple scene graphs through graph convolution; each scene graph contains the object and relationship information of the corresponding text. Then, the bounding box and segmentation mask of each object are predicted to compute the scene layout. Finally, a sequence of images that better matches the objects and their relationships is generated from the scene layout and the context information. Tests on the CLEVR-SV and CoDraw-SV datasets show that the proposed method can generate 64×64-pixel image sequences containing multiple objects and their relationships. On the CLEVR-SV dataset, the SSIM and FID of the method improve by 1.34% and 9.49%, respectively, over StoryGAN. On the CoDraw-SV dataset, the ACC of the method is 7.40% higher than that of StoryGAN. The proposed method makes the layout of the generated scenes more plausible: it can generate image sequences containing multiple objects and relationships, and the generated images are of higher quality with clearer details.
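As a rough illustration of the intermediate representations named above (a scene graph, per-object bounding boxes, and a scene layout), the following minimal Python sketch rasterizes hand-written boxes into a 64×64 grid of object indices. It is not the paper's implementation: in the paper the boxes and segmentation masks are predicted by a network, whereas here the boxes are fixed by hand and masks are omitted. The names `SceneGraph` and `rasterize_layout`, the example objects, and the box coordinates are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical container: objects plus (subject, predicate, object) triples,
# where subject and object are indices into `objects`.
@dataclass
class SceneGraph:
    objects: List[str]
    triples: List[Tuple[int, str, int]]

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1), normalized to [0, 1]

def rasterize_layout(boxes: List[Box], size: int = 64) -> List[List[int]]:
    """Paint each box onto a size x size grid; -1 marks background.

    Assumption: later objects overwrite earlier ones where boxes overlap.
    """
    layout = [[-1] * size for _ in range(size)]
    for idx, (x0, y0, x1, y1) in enumerate(boxes):
        c0, c1 = int(x0 * size), int(x1 * size)
        r0, r1 = int(y0 * size), int(y1 * size)
        for r in range(r0, r1):
            for c in range(c0, c1):
                layout[r][c] = idx
    return layout

# One sentence of a paragraph, e.g. "a cube is to the left of a sphere".
graph = SceneGraph(objects=["cube", "sphere"], triples=[(0, "left of", 1)])

# Hand-written stand-ins for the boxes a trained model would predict.
boxes = [(0.0, 0.25, 0.5, 0.75), (0.5, 0.25, 1.0, 0.75)]
layout = rasterize_layout(boxes, size=64)
```

In a full pipeline, one such layout would be computed per scene graph in the paragraph and passed, together with the context information, to the image generator.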
Keywords