IET Image Processing (Feb 2024)

HQ‐I2IT: Redesign the optimization scheme to improve image quality in CycleGAN‐based image translation systems

  • Yipeng Zhang,
  • Bingliang Hu,
  • Yingying Huang,
  • Chi Gao,
  • Jianfu Yin,
  • Quang Wang

DOI
https://doi.org/10.1049/ipr2.12965
Journal volume & issue
Vol. 18, no. 2
pp. 507 – 522

Abstract

The image‐to‐image translation (I2IT) task aims to transform images from a source domain into a specified target domain. State‐of‐the‐art CycleGAN‐based translation algorithms typically use a cycle consistency loss and a latent regression loss to constrain translation. This work demonstrates that constraining the model parameters with the cycle consistency loss and the latent regression loss is equivalent to optimizing the medians of the data distribution and the generative distribution. In addition, there is a style bias in the translation. This bias interacts between the generator and the style encoder and manifests visually as translation errors, e.g. the style of the generated image does not match the style of the reference image. To address these issues, a new I2IT model termed high‐quality‐I2IT (HQ‐I2IT) is proposed. The optimization scheme is redesigned to prevent the model from optimizing the median of the data distribution. In addition, by separating the optimization of the generator and the latent code estimator, the redesigned model avoids error interactions and gradually corrects errors during training, thereby avoiding learning the median of the generated distribution. The experimental results demonstrate that the visual quality of the images produced by HQ‐I2IT is significantly improved without changing the generator structure, especially when guided by reference images. Specifically, the Fréchet inception distance is reduced from 19.8 to 10.2 on the AFHQ dataset and from 23.8 to 17.0 on the CelebA‐HQ dataset.
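
The median claim in the abstract can be illustrated with a small numerical sketch. It assumes the cycle consistency and latent regression losses are L1 (absolute-error) losses, as is conventional in CycleGAN-style models; the abstract itself does not state the norm. The toy samples and the helper names (`l1_loss`, `l2_loss`, `argmin_on_grid`) are hypothetical and are not taken from the paper. The sketch only shows the general fact that a constant minimizing mean absolute error lands on the sample median, whereas one minimizing mean squared error lands on the sample mean.

```python
import statistics

def l1_loss(c, samples):
    # Mean absolute error between a constant prediction c and the samples.
    return sum(abs(x - c) for x in samples) / len(samples)

def l2_loss(c, samples):
    # Mean squared error between a constant prediction c and the samples.
    return sum((x - c) ** 2 for x in samples) / len(samples)

def argmin_on_grid(loss, samples, lo, hi, steps=2001):
    # Brute-force search for the constant that minimizes `loss` on a uniform grid.
    best_c, best_val = lo, loss(lo, samples)
    for i in range(1, steps):
        c = lo + (hi - lo) * i / (steps - 1)
        val = loss(c, samples)
        if val < best_val:
            best_c, best_val = c, val
    return best_c

# Hypothetical skewed toy data: four small values and one outlier.
samples = [0.0, 0.1, 0.2, 0.3, 10.0]

l1_opt = argmin_on_grid(l1_loss, samples, 0.0, 10.0)
l2_opt = argmin_on_grid(l2_loss, samples, 0.0, 10.0)

print("L1 minimizer:", l1_opt, "median:", statistics.median(samples))
print("L2 minimizer:", l2_opt, "mean:", statistics.mean(samples))
```

On skewed data such as this, the L1 minimizer ignores the outlier and sits at the median, which gives some intuition for why an L1-constrained model may settle on a "median" output rather than a faithful sample of the distribution.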

Keywords