Instance Mask Embedding and Attribute-Adaptive Generative Adversarial Network for Text-to-Image Synthesis

Jiancheng Ni; Susu Zhang; Zili Zhou; Jie Hou; Feng Gao

doi:10.1109/ACCESS.2020.2975841

IEEE Access (Jan 2020)

Affiliations

Jiancheng Ni: School of Software, Qufu Normal University, Qufu, China
Susu Zhang: School of Software, Qufu Normal University, Qufu, China
Zili Zhou: School of Software, Qufu Normal University, Qufu, China
Jie Hou: School of Software, Qufu Normal University, Qufu, China
Feng Gao: School of Software, Qufu Normal University, Qufu, China

DOI: https://doi.org/10.1109/ACCESS.2020.2975841
Journal volume & issue: Vol. 8
pp. 37697 – 37711

Abstract

Existing image generation models can synthesize plausible individual objects, as well as complex scenes at low resolution. Generating high-resolution images directly from complicated text remains a challenge. To this end, we propose the instance mask embedding and attribute-adaptive generative adversarial network (IMEAA-GAN). First, we use a box regression network to compute a global layout containing the class label and location of each instance. Then the global generator encodes the layout and combines it with the whole-text embedding and noise to generate a preliminary low-resolution image; the instance embedding mechanism is first used to guide the local refinement generators to obtain fine-grained local features and generate a more realistic image. Finally, in order to synthesize the exact visual attributes, we introduce a multi-scale attribute-adaptive discriminator, which provides the local refinement generators with specific training signals to explicitly generate instance-level features. Extensive experiments on the MS-COCO dataset and the Caltech-UCSD Birds-200-2011 dataset show that our model can obtain globally consistent attributes and generate complex images with local texture details.
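
The abstract's first step yields a global layout: a class label and a location for each instance. As a rough illustration of what such a layout can look like once it is turned into a grid that a generator could consume, the sketch below rasterizes hand-written boxes into one binary map per class. It is not the authors' implementation. The names `Instance` and `rasterize_layout`, the normalized-corner box format, and the binary per-class encoding are all assumptions made for this example; in the paper the boxes come from a box regression network rather than being specified by hand.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Instance:
    """One entry of a hypothetical global layout: class index plus a box.

    Box corners are normalized to [0, 1]; this format is an assumption.
    """
    label: int
    x0: float
    y0: float
    x1: float
    y1: float


def rasterize_layout(instances: List[Instance], num_classes: int,
                     size: int) -> List[List[List[float]]]:
    """Return layout[c][row][col] = 1.0 where a box of class c covers the cell."""
    layout = [[[0.0] * size for _ in range(size)] for _ in range(num_classes)]

    def to_cells(lo: float, hi: float) -> range:
        # Convert a normalized interval to a clamped range of grid indices.
        start = max(0, min(size, int(math.floor(lo * size))))
        stop = max(0, min(size, int(math.ceil(hi * size))))
        return range(start, stop)

    for inst in instances:
        for row in to_cells(inst.y0, inst.y1):
            for col in to_cells(inst.x0, inst.x1):
                layout[inst.label][row][col] = 1.0
    return layout
```

Calling `rasterize_layout([Instance(0, 0.0, 0.0, 0.5, 0.5)], num_classes=2, size=4)` marks the top-left quarter of the class-0 map and leaves the class-1 map empty. A per-class grid of this kind is one common way to present box layouts to a convolutional encoder; whether IMEAA-GAN encodes its layout this way is not stated in the abstract.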

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
