Learning Disentangled Representations with Latent Variation Predictability

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12355)


Latent traversal is a popular approach to visualizing disentangled latent representations. When a single unit of the latent representation is varied, a single factor of variation in the data is expected to change while the others stay fixed. However, this empirical observation is rarely encoded explicitly in the objective function used to learn disentangled representations. This paper defines the variation predictability of latent disentangled representations. Given image pairs generated by latent codes that differ in a single dimension, the varied dimension should be closely correlated with these image pairs if the representation is well disentangled. Within an adversarial generation process, we encourage variation predictability by maximizing the mutual information between latent variations and the corresponding image pairs. We further develop an evaluation metric that measures the disentanglement of latent representations without relying on the ground-truth generative factors. The proposed variation predictability is a general constraint that applies to both the VAE and GAN frameworks for improving the disentanglement of latent representations. Experiments show that the proposed variation predictability correlates well with existing metrics that require ground truth, and that the proposed algorithm is effective for disentanglement learning.
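The idea of predicting the varied dimension from a generated pair can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the identity "generator", and the hand-written "predictor" are all hypothetical stand-ins, and the bound used is the standard variational lower bound on mutual information, I(d; pair) >= H(d) - CE, which we assume is close in spirit to the objective described above.

```python
import numpy as np


def sample_latent_pair(rng, batch, dim):
    """Sample codes z1, z2 that differ in exactly one randomly chosen dimension."""
    z1 = rng.standard_normal((batch, dim))
    d = rng.integers(0, dim, size=batch)  # index of the varied dimension
    # Keep the perturbation away from zero so the variation is always visible.
    step = rng.choice([-1.0, 1.0], size=batch) * rng.uniform(0.5, 1.5, size=batch)
    delta = np.zeros((batch, dim))
    delta[np.arange(batch), d] = step
    return z1, z1 + delta, d


def cross_entropy(logits, labels):
    """Mean negative log-likelihood of labels under softmax(logits)."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()


def mi_lower_bound(logits, labels, dim):
    """Variational bound H(d) - CE, with d uniform over `dim` choices."""
    return np.log(dim) - cross_entropy(logits, labels)


rng = np.random.default_rng(0)
dim = 5
z1, z2, d = sample_latent_pair(rng, batch=256, dim=dim)

# Hypothetical, perfectly disentangled "generator": the identity map,
# so each output coordinate depends on exactly one latent unit.
x1, x2 = z1, z2

# Hypothetical "predictor": score each dimension by how much the pair differs there.
logits_informed = 50.0 * np.abs(x1 - x2)
# A predictor that ignores the pair entirely (uniform guess).
logits_blind = np.zeros((len(d), dim))

bound_informed = mi_lower_bound(logits_informed, d, dim)
bound_blind = mi_lower_bound(logits_blind, d, dim)
print(f"informed predictor bound: {bound_informed:.4f}")
print(f"blind predictor bound:    {bound_blind:.4f}")
```

In this toy setting the informed predictor pushes the bound close to its maximum of log(5), while the blind predictor leaves it at zero. In an actual model the generator and predictor would be neural networks trained jointly, and the bound would enter the training loss as an additional term.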



This work was supported by Australian Research Council Projects FL-170100117, DP-180103424 and DE180101438. We thank Jiaxian Guo and Youjian Zhang for their constructive discussions.




Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. UBTECH Sydney AI Centre, School of Computer Science, Faculty of Engineering, The University of Sydney, Darlington, Australia
