IEEE Access (Jan 2024)

DuDGAN: Improving Class-Conditional GANs via Dual-Diffusion

  • Taesun Yeom,
  • Chanhoe Gu,
  • Minhyeok Lee

DOI
https://doi.org/10.1109/ACCESS.2024.3372996
Journal volume & issue
Vol. 12
pp. 39651 – 39661

Abstract

Class-conditional image generation using generative adversarial networks (GANs) has been investigated through various techniques; however, it continues to face challenges such as mode collapse, training instability, and low-quality output on datasets with high intra-class variation. Furthermore, most GANs require many iterations to converge, resulting in poor iteration efficiency during training. While Diffusion-GAN has shown potential in generating realistic samples, it is critically limited in generating class-conditional samples. To overcome these limitations, we propose a novel approach for class-conditional image generation using GANs called DuDGAN, which incorporates a dual diffusion-based noise injection process. DuDGAN consists of three distinct networks: a discriminator, a generator, and a classifier. During training, Gaussian-mixture noise is injected into the two noise-aware networks, the discriminator and the classifier, in distinct ways. This noisy data helps to prevent overfitting by gradually introducing more challenging tasks, leading to improved model performance. As a result, DuDGAN outperforms state-of-the-art conditional GAN models for image generation. We evaluated DuDGAN using the AFHQ, Food-101, CIFAR-10, and BAAT datasets and observed superior results across metrics such as FID, KID, Precision, and Recall compared with baseline models; FID decreases by 12.9% and 5.1% on average for AFHQ and CIFAR-10, respectively, highlighting the effectiveness of the proposed approach.
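
The dual noise injection described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a standard forward-diffusion step with a linear variance schedule, and the names `make_alpha_bars` and `diffuse`, the schedule values, and the two timestep ranges (`max_t=64` for the discriminator, `max_t=16` for the classifier) are hypothetical choices made only for illustration. Because each image draws its own random timestep, the noise seen across a batch is a mixture of Gaussians with different variances.

```python
import numpy as np

def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=2e-2):
    # Assumed linear variance schedule; returns the cumulative product of (1 - beta_t).
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def diffuse(images, alpha_bars, max_t, rng, sigma=1.0):
    # Draw one timestep per image from [0, max_t), so the batch-level noise
    # distribution is a Gaussian mixture over noise levels.
    t = rng.integers(0, max_t, size=images.shape[0])
    # Reshape the per-image coefficient so it broadcasts over channels and pixels.
    a = alpha_bars[t].reshape(-1, *([1] * (images.ndim - 1)))
    eps = rng.standard_normal(images.shape)
    noisy = np.sqrt(a) * images + np.sqrt(1.0 - a) * sigma * eps
    return noisy, t

rng = np.random.default_rng(0)
alpha_bars = make_alpha_bars(num_steps=1000)
batch = rng.uniform(-1.0, 1.0, size=(8, 3, 32, 32))  # stand-in for real or generated images

# Two independent injections, one per noise-aware network (ranges are assumptions).
noisy_for_discriminator, t_d = diffuse(batch, alpha_bars, max_t=64, rng=rng)
noisy_for_classifier, t_c = diffuse(batch, alpha_bars, max_t=16, rng=rng)
```

In a training loop, `max_t` would typically be raised over time so that the tasks given to the two networks become gradually harder, which is the role the abstract attributes to the noisy data.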

Keywords