
DOT-VAE: Disentangling One Factor at a Time

  • Conference paper
Artificial Neural Networks and Machine Learning – ICANN 2022 (ICANN 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13529)


Abstract

As we enter an era of machine learning characterized by an overabundance of data, discovering, organizing, and interpreting that data in an unsupervised manner becomes a critical need. One promising approach to this endeavour is disentanglement, which aims to learn the underlying generative latent factors of the data, called the factors of variation, and to encode them in disjoint latent representations. Recent advances have sought to solve this problem for synthetic datasets generated by a fixed set of independent factors of variation. Here, we propose to extend this to real-world datasets with a countable number of factors of variation. We propose a novel framework which augments the latent space of a Variational Autoencoder with a disentangled space and is trained using a Wake-Sleep-inspired two-step algorithm for unsupervised disentanglement. Our network learns to disentangle interpretable, independent factors from the data “one at a time” and encodes them in different dimensions of the disentangled latent space, while making no prior assumptions about the number of factors or their joint distribution. We demonstrate its quantitative and qualitative effectiveness by evaluating the latent representations learned on two synthetic benchmark datasets, dSprites and 3DShapes, and on a real-world dataset, CelebA.

Supported by University of Maryland.
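The abstract's core idea, augmenting a VAE's latent space with a separate disentangled space and unmasking one disentangled dimension at a time, can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the dimensions `ENTANGLED_DIM` and `DISENT_DIM`, and the `disentangle_one_factor` masking helper, are hypothetical names chosen for illustration, and the actual training losses are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions, not taken from the paper.
ENTANGLED_DIM = 10   # unstructured latent code z
DISENT_DIM = 6       # one slot per discovered factor of variation


def reparameterize(mu, logvar, rng):
    """Standard VAE reparameterization: sample = mu + sigma * eps."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps


def split_latent(full_latent):
    """Split the augmented latent code into the entangled part z and
    the disentangled part c, as the abstract describes."""
    z = full_latent[..., :ENTANGLED_DIM]
    c = full_latent[..., ENTANGLED_DIM:]
    return z, c


def disentangle_one_factor(c, step):
    """'One factor at a time': at step k, only dimension k of the
    disentangled space is active. Purely illustrative masking, not
    the paper's actual objective."""
    mask = np.zeros(DISENT_DIM)
    mask[step % DISENT_DIM] = 1.0
    return c * mask


# A batch of 4 latent codes drawn from the (augmented) posterior.
mu = rng.standard_normal((4, ENTANGLED_DIM + DISENT_DIM))
logvar = np.zeros_like(mu)
z, c = split_latent(reparameterize(mu, logvar, rng))
```

In an actual two-step (Wake-Sleep-style) loop, one phase would update the autoencoder for reconstruction while the other intervenes on the single active disentangled dimension; the masking above only shows how the latent space is partitioned.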


Notes

  1.

    Code available at https://github.com/DOTFactor/DOTFactor.


Author information

Correspondence to Vaishnavi Patil.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Patil, V., Evanusa, M., JaJa, J. (2022). DOT-VAE: Disentangling One Factor at a Time. In: Pimenidis, E., Angelov, P., Jayne, C., Papaleonidas, A., Aydin, M. (eds) Artificial Neural Networks and Machine Learning – ICANN 2022. ICANN 2022. Lecture Notes in Computer Science, vol 13529. Springer, Cham. https://doi.org/10.1007/978-3-031-15919-0_10

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-15919-0_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-15918-3

  • Online ISBN: 978-3-031-15919-0

  • eBook Packages: Computer Science, Computer Science (R0)
