IEEE Access (Jan 2021)

Disentangled Representation Learning in Real-World Image Datasets via Image Segmentation Prior

  • Nao Nakagawa,
  • Ren Togo,
  • Takahiro Ogawa,
  • Miki Haseyama

DOI
https://doi.org/10.1109/ACCESS.2021.3101229
Journal volume & issue
Vol. 9
pp. 110880 – 110888

Abstract

We propose a novel method that learns easy-to-interpret latent representations of real-world image datasets with a model based on the variational autoencoder (VAE) by splitting an image into several disjoint regions. Our method performs object-wise disentanglement by exploiting image segmentation and alpha compositing. Following the remarkable results of unsupervised disentanglement methods on toy datasets, recent studies have tackled the more challenging problem of disentanglement on real-world image datasets. However, these methods deviate from the standard VAE architecture, which has favorable disentanglement properties. To achieve disentanglement on real-world image datasets while preserving the VAE backbone, we designed an encoder and a decoder that embed an image into disjoint sets of latent variables, each corresponding to an object. The encoder includes a pre-trained image segmentation network, which lets our model focus solely on representation learning while using image segmentation as an inductive bias. Evaluations on two real-world image datasets, CelebA and Stanford Cars, showed that our method achieves improved disentanglement and transferability.
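
The abstract names alpha compositing as the step that merges object-wise outputs into one image, but it does not give the formula. The sketch below is only an illustration of standard alpha compositing, assuming one mask and one decoded layer per region and per-pixel mask weights that sum to 1. The function name `composite`, the grayscale 2x2 inputs, and the plain nested-list representation are all hypothetical and not taken from the paper, whose model operates on real images with neural networks.

```python
def composite(masks, layers):
    """Blend K per-region layers into one image using K alpha masks.

    masks[k][y][x]  : weight of region k at pixel (y, x); weights over k sum to 1
    layers[k][y][x] : intensity produced for region k at pixel (y, x)
    """
    height, width = len(masks[0]), len(masks[0][0])
    out = [[0.0] * width for _ in range(height)]
    for mask, layer in zip(masks, layers):
        for y in range(height):
            for x in range(width):
                # Each region contributes only where its mask is non-zero.
                out[y][x] += mask[y][x] * layer[y][x]
    return out


# Hypothetical 2x2 example with two disjoint regions (left and right column).
masks = [
    [[1.0, 0.0], [1.0, 0.0]],  # region 0 covers the left column
    [[0.0, 1.0], [0.0, 1.0]],  # region 1 covers the right column
]
layers = [
    [[0.2, 0.2], [0.2, 0.2]],  # stand-in for the output decoded for region 0
    [[0.9, 0.9], [0.9, 0.9]],  # stand-in for the output decoded for region 1
]
image = composite(masks, layers)
```

Because the masks here are disjoint, every output pixel comes from exactly one layer, which is the sense in which each set of latent variables would control only its own region.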

Keywords