IEEE Access (Jan 2024)
Locally Conditioned GANs: Self-Supervised Local Patch Representation Learning for Conditional Generation
Abstract
Existing conditional generation models based on generative adversarial networks (GANs) suffer from two common limitations: 1) they rely heavily on supervision, or 2) they perform well only when making small changes. This study addresses both issues by introducing locally conditioned generative adversarial networks (LCGAN). Inspired by self-supervised representation learning, we devise intuitive learning signals and training tactics to learn a local patch encoding that yields a locally controllable latent space for GANs. Powered by this local patch encoding and our novel loss design, the proposed model successfully performs locally conditioned image generation while covering various attributes. With LCGAN, ordinary users can easily design an image by choosing its patch-level appearance from various patch examples, including out-of-domain examples. In addition, LCGAN with latent optimization offers high-quality results in local editing. Experimental evaluations verify that our model is effective in both conditional generation and local editing, achieving both image quality and fidelity. Our method was the most preferred by 55.78% of user study participants, and it achieved Fréchet inception distance scores of 16.24 and 15.01 on the FFHQ and AFHQ-cat datasets, respectively. In particular, a comprehensive user study supports that 1) existing methods exhibit a trade-off between quality and fidelity, and 2) our model is the first to alleviate this trade-off, showing its potential for practical image editing applications.
Keywords