IEEE Access (Jan 2024)
DuDGAN: Improving Class-Conditional GANs via Dual-Diffusion
Abstract
Class-conditional image generation using generative adversarial networks (GANs) has been investigated through various techniques; however, it continues to face challenges such as mode collapse, training instability, and low-quality output on datasets with high intra-class variation. Furthermore, many GANs require a large number of iterations to converge, resulting in poor iteration efficiency during training. While Diffusion-GAN has shown potential in generating realistic samples, it has a critical limitation: it cannot generate class-conditional samples. To overcome these limitations, we propose DuDGAN, a novel approach for class-conditional image generation with GANs that incorporates a dual diffusion-based noise injection process. DuDGAN consists of three distinct networks: a discriminator, a generator, and a classifier. During training, Gaussian-mixture noise is injected into the two noise-aware networks, the discriminator and the classifier, in distinct ways. These noisy inputs help prevent overfitting by gradually introducing more challenging tasks, leading to improved model performance. As a result, DuDGAN outperforms state-of-the-art conditional GAN models for image generation. We evaluated DuDGAN on the AFHQ, Food-101, CIFAR-10, and BAAT datasets and observed superior results across FID, KID, Precision, and Recall compared with competing models; FID decreases by 12.9% and 5.1% on average for AFHQ and CIFAR-10, respectively, highlighting the effectiveness of the proposed approach.
Keywords
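The abstract's central mechanism, diffusion-style injection of Gaussian-mixture noise whose intensity grows as training progresses, can be illustrated with a minimal sketch. This is not the authors' implementation: the mixture weights, sigma values, and the linear schedule over `t / t_max` below are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def gaussian_mixture_noise(shape, weights, sigmas, rng):
    """Sample noise from a Gaussian mixture: pick one component per batch
    element, then draw zero-mean noise with that component's sigma."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    idx = rng.choice(len(sigmas), size=shape[0], p=weights)
    # Broadcast each sample's sigma over its channel/spatial dimensions.
    sigma = np.asarray(sigmas, dtype=float)[idx].reshape(-1, *([1] * (len(shape) - 1)))
    return rng.standard_normal(shape) * sigma

def inject_noise(images, t, t_max, weights=(0.5, 0.5), sigmas=(0.05, 0.2), rng=None):
    """Diffusion-style injection: blend clean images with Gaussian-mixture
    noise, with the noise share growing linearly in training progress t/t_max.
    (The linear schedule is an assumption for illustration.)"""
    rng = rng or np.random.default_rng(0)
    alpha = 1.0 - t / t_max  # weight on the clean signal; shrinks over training
    noise = gaussian_mixture_noise(images.shape, weights, sigmas, rng)
    return alpha * images + (1.0 - alpha) * noise
```

In this sketch, early in training (`t` near 0) the networks see nearly clean images, and the task is made progressively harder as the noise share grows, mirroring the abstract's description of gradually introducing more challenging inputs to the discriminator and classifier.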