Locally Conditioned GANs: Self-Supervised Local Patch Representation Learning for Conditional Generation

Dongseob Kim; Hyunjung Shim

doi:10.1109/ACCESS.2024.3418884

IEEE Access (Jan 2024)

Affiliations

Dongseob Kim: School of Integrated Technology, Yonsei University, Yeonsu-gu, Incheon, Republic of Korea
Hyunjung Shim: Kim Jaechul Graduate School of Artificial Intelligence, Korea Advanced Institute of Science and Technology (KAIST), Dongdaemun-gu, Seoul, Republic of Korea

DOI: https://doi.org/10.1109/ACCESS.2024.3418884
Journal volume & issue: Vol. 12, pp. 134115–134132

Abstract

Existing conditional generation models based on generative adversarial networks (GANs) share two common limitations: 1) they rely heavily on supervision, or 2) they perform well only when making small changes. This study addresses both issues by introducing locally conditioned generative adversarial networks (LCGAN). Inspired by self-supervised representation learning, we devise intuitive learning signals and training tactics for learning a local patch encoding, which yields a locally controllable latent space for GANs. Powered by this local patch encoding and our novel loss design, the proposed model performs locally conditioned image generation while covering various attributes. With LCGAN, ordinary users can easily design an image by choosing its patch-level appearance from various patch examples, including out-of-domain examples. In addition, LCGAN combined with latent optimization offers high-quality results in local editing. Experimental evaluations verify that our model is effective in both conditional generation and local editing, achieving both image quality and fidelity. Our method was the most preferred by 55.78% of user study participants, and it achieved Fréchet inception distance scores of 16.24 and 15.01 on the FFHQ and AFHQ-cat datasets, respectively. In particular, a comprehensive user study supports two findings: 1) existing methods exhibit a trade-off between quality and fidelity, and 2) our model is the first to alleviate this trade-off, showing its potential for practical image editing applications.

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
