EURASIP Journal on Advances in Signal Processing (Jun 2023)
WDIG: a wavelet domain image generation framework based on frequency domain optimization
Abstract
In end-to-end image generation, the pixel-space (spatial-domain) representation cannot explicitly separate low-frequency general information, such as texture and color, from high-frequency detail information, such as structure and identity. A loss function computed only in the spatial domain therefore fails to constrain the preservation of detail effectively, and the quality of the generated images is insufficient. In this paper, a wavelet domain image generation (WDIG) framework is proposed to preserve the frequency information of images, with loss functions constructed in both the pixel space and the wavelet space. In the pixel space, the low-frequency and high-frequency components of the signal are obtained by Gaussian blurring with an appropriately chosen Gaussian kernel, and an $$\ell_{1}$$-norm spatial-domain loss is constructed on these low-frequency and high-frequency components. In the wavelet space, the sub-band coefficients of each channel are obtained by a wavelet transform, which explicitly separates the image into high-frequency and low-frequency information, and an $$\ell_{1}$$-norm frequency-domain loss is constructed on each set of sub-band coefficients. WDIG thus constrains model training more accurately and optimizes the model more precisely, so that the details and quality of the generated image are better preserved. The WDIG framework is evaluated on image generation applications including style transfer, image translation and generative adversarial network (GAN) inversion. Experimental results show that the WDIG framework effectively retains image details, generates more realistic images, and improves image quality in each of these applications.
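The two loss terms described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several choices are assumptions, because the abstract does not specify them: the high-frequency pixel-space component is taken as the residual after Gaussian blurring, the wavelet is a single-level orthonormal Haar transform, all terms are weighted equally, and images are single-channel arrays with even height and width. The function names (`gaussian_blur`, `haar_dwt2`, `pixel_frequency_loss`, `wavelet_loss`) are hypothetical.

```python
import numpy as np


def gaussian_blur(img, sigma=2.0):
    """Separable Gaussian blur with reflect padding (sigma is an assumed value)."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    kernel /= kernel.sum()  # normalise so constant images are unchanged
    padded = np.pad(img, radius, mode="reflect")
    # Filter along rows, then along columns; "valid" removes the padding again.
    rows = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)


def haar_dwt2(img):
    """Single-level orthonormal 2-D Haar transform (assumed wavelet choice).

    Returns the low-frequency sub-band and a tuple of three high-frequency sub-bands.
    """
    a, b = img[0::2, 0::2], img[0::2, 1::2]
    c, d = img[1::2, 0::2], img[1::2, 1::2]
    ll = (a + b + c + d) / 2.0  # approximation (low frequency)
    lh = (a + b - c - d) / 2.0  # difference between the two rows of each block
    hl = (a - b + c - d) / 2.0  # difference between the two columns of each block
    hh = (a - b - c + d) / 2.0  # diagonal difference
    return ll, (lh, hl, hh)


def l1(x, y):
    """Mean absolute difference, used here as the l1-norm loss."""
    return float(np.mean(np.abs(x - y)))


def pixel_frequency_loss(generated, reference, sigma=2.0):
    """l1 loss on blurred (low-frequency) and residual (high-frequency) components."""
    gen_low = gaussian_blur(generated, sigma)
    ref_low = gaussian_blur(reference, sigma)
    # Assumption: high frequency = image minus its blurred version.
    return l1(gen_low, ref_low) + l1(generated - gen_low, reference - ref_low)


def wavelet_loss(generated, reference):
    """l1 loss on each wavelet sub-band, equally weighted (assumed weighting)."""
    gen_ll, gen_high = haar_dwt2(generated)
    ref_ll, ref_high = haar_dwt2(reference)
    high = sum(l1(g, r) for g, r in zip(gen_high, ref_high))
    return l1(gen_ll, ref_ll) + high


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    reference = rng.random((16, 16))
    generated = reference + 0.05 * rng.standard_normal((16, 16))
    print("pixel-space loss:", pixel_frequency_loss(generated, reference))
    print("wavelet-space loss:", wavelet_loss(generated, reference))
```

In a training setting these terms would be computed with a differentiable framework and added to the task loss; the relative weights are a design choice that the abstract does not state.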
Keywords