Learning Conditional Invariance Through Cycle Consistency

  • Conference paper

Pattern Recognition (DAGM GCPR 2021)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 13024)

Included in the conference series: DAGM German Conference on Pattern Recognition

Abstract

Identifying meaningful and independent factors of variation in a dataset is a challenging learning task that is frequently addressed with deep latent variable models. The task can be viewed as learning symmetry transformations that preserve the value of a chosen property along latent dimensions. However, existing approaches exhibit severe drawbacks in enforcing this invariance property in the latent space. We address these shortcomings with a novel approach to cycle consistency. Our method involves two separate latent subspaces for the target property and the remaining input information, respectively. To enforce invariance as well as sparsity in the latent space, we incorporate semantic knowledge through cycle-consistency constraints that rely on property side information. The proposed method is based on the deep information bottleneck and, in contrast to other approaches, allows the use of continuous target properties and provides inherent model selection capabilities. We demonstrate on synthetic and molecular data that our approach identifies more meaningful factors, which lead to sparser and more interpretable models with improved invariance properties.

M. Samarin and V. Nesterov contributed equally to this work.
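The abstract sketches a concrete architecture: an encoder whose latent code is partitioned into a property subspace and a residual subspace, a property predictor that reads only the first subspace, and a cycle-consistency constraint built from property side information. The following minimal TensorFlow sketch illustrates that split-latent idea; it is not the authors' implementation. All dimensions, layer sizes, and unit loss weights are assumptions, and the paper's stochastic deep information bottleneck encoder is replaced here by a deterministic one for brevity.

```python
# Minimal sketch (not the authors' released code) of the split-latent idea
# from the abstract: the encoder output is partitioned into a property
# subspace z0 and a residual subspace z1; a linear head predicts the
# continuous property y from z0 alone, and a cycle term asks that decoding
# with a swapped property code and re-encoding recovers that code.
# Dimensions, architectures, and unit loss weights are assumptions; the
# paper's stochastic deep-IB encoder and its compression term are omitted.
import tensorflow as tf

DIM_X, DIM_Z0, DIM_Z1 = 16, 2, 8  # assumed input/subspace sizes

encoder = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(DIM_Z0 + DIM_Z1),
])
decoder = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(DIM_X),
])
property_head = tf.keras.layers.Dense(1)  # continuous target property

def split(z):
    # First DIM_Z0 dimensions carry the property, the rest is residual.
    return z[:, :DIM_Z0], z[:, DIM_Z0:]

def total_loss(x, y):
    z0, z1 = split(encoder(x))
    x_hat = decoder(tf.concat([z0, z1], axis=1))
    recon = tf.reduce_mean(tf.square(x - x_hat))             # reconstruction
    prop = tf.reduce_mean(tf.square(y - property_head(z0)))  # property fit
    # Cycle consistency: pair each residual code with another sample's
    # property code, decode, re-encode, and require the property subspace
    # of the re-encoding to match the swapped-in code.
    idx = tf.random.shuffle(tf.range(tf.shape(z0)[0]))
    z0_swap = tf.gather(z0, idx)
    z0_cycle, _ = split(encoder(decoder(tf.concat([z0_swap, z1], axis=1))))
    cycle = tf.reduce_mean(tf.square(z0_swap - z0_cycle))
    return recon + prop + cycle

# Example on a random batch:
x = tf.random.normal([32, DIM_X])
y = tf.random.normal([32, 1])
print("combined objective:", float(total_loss(x, y)))
```

The cycle term is the key design choice: decoding with a swapped property code and re-encoding should return that code, which pushes the residual subspace to be invariant to the property.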


Notes

  1. Implementation [1, 18]: https://github.com/bmda-unibas/CondInvarianceCC.
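The cited repository builds on TensorFlow [1] and uses the Adam optimiser [18]. The following self-contained toy example illustrates that optimisation setup only; the model, learning rate, and data are illustrative stand-ins, not the repository's actual configuration.

```python
# Toy stand-in showing the cited TensorFlow [1] + Adam [18] training setup;
# model, learning rate, and data are illustrative assumptions only.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1),
])
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)

@tf.function
def train_step(x, y):
    # One gradient step on a squared-error objective.
    with tf.GradientTape() as tape:
        loss = tf.reduce_mean(tf.square(y - model(x)))
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss

x = tf.random.normal([32, 4])
y = tf.random.normal([32, 1])
for _ in range(5):
    loss = train_step(x, y)
print("final loss:", float(loss))
```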

References

  1. Abadi, M., et al.: TensorFlow: large-scale machine learning on heterogeneous systems (2015). https://www.tensorflow.org/. Software available from tensorflow.org

  2. Achille, A., Soatto, S.: Information dropout: learning optimal representations through noisy computation. IEEE Trans. Pattern Anal. Mach. Intell. 40(12), 2897–2905 (2018)

  3. Ainsworth, S.K., Foti, N.J., Lee, A.K.C., Fox, E.B.: oi-VAE: output interpretable VAEs for nonlinear group factor analysis. In: Proceedings of the 35th International Conference on Machine Learning (2018)

  4. Alemi, A.A., Fischer, I., Dillon, J.V., Murphy, K.: Deep variational information bottleneck. In: 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, 24–26 April 2017, Conference Track Proceedings. OpenReview.net (2017). https://openreview.net/forum?id=HyxQzBceg

  5. Bouchacourt, D., Tomioka, R., Nowozin, S.: Multi-level variational autoencoder: learning disentangled representations from grouped observations. In: AAAI Conference on Artificial Intelligence (2018)

  6. Chechik, G., Globerson, A., Tishby, N., Weiss, Y.: Information bottleneck for Gaussian variables. J. Mach. Learn. Res. 6, 165–188 (2005)

  7. Chen, R.T., Li, X., Grosse, R., Duvenaud, D.: Isolating sources of disentanglement in variational autoencoders. arXiv preprint arXiv:1802.04942 (2018)

  8. Chen, X., Duan, Y., Houthooft, R., Schulman, J., Sutskever, I., Abbeel, P.: InfoGAN: interpretable representation learning by information maximizing generative adversarial nets. arXiv preprint arXiv:1606.03657 (2016)

  9. Chicharro, D., Besserve, M., Panzeri, S.: Causal learning with sufficient statistics: an information bottleneck approach. arXiv preprint arXiv:2010.05375 (2020)

  10. Creswell, A., Mohamied, Y., Sengupta, B., Bharath, A.A.: Adversarial information factorization (2018)

  11. Gómez-Bombarelli, R., et al.: Automatic chemical design using a data-driven continuous representation of molecules. ACS Cent. Sci. 4(2), 268–276 (2018)

  12. Hansen, K., et al.: Machine learning predictions of molecular properties: accurate many-body potentials and nonlocality in chemical space. J. Phys. Chem. Lett. 6(12), 2326–2331 (2015)

  13. Higgins, I., et al.: β-VAE: learning basic visual concepts with a constrained variational framework. In: International Conference on Learning Representations (2017)

  14. Jha, A.H., Anand, S., Singh, M., Veeravasarapu, V.S.R.: Disentangling factors of variation with cycle-consistent variational auto-encoders. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11207, pp. 829–845. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01219-9_49

  15. Keller, S.M., Samarin, M., Torres, F.A., Wieser, M., Roth, V.: Learning extremal representations with deep archetypal analysis. Int. J. Comput. Vision 129(4), 805–820 (2021)

  16. Keller, S.M., Samarin, M., Wieser, M., Roth, V.: Deep archetypal analysis. In: Fink, G.A., Frintrop, S., Jiang, X. (eds.) DAGM GCPR 2019. LNCS, vol. 11824, pp. 171–185. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-33676-9_12

  17. Kim, H., Mnih, A.: Disentangling by factorising. In: International Conference on Machine Learning, pp. 2649–2658. PMLR (2018)

  18. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: International Conference on Learning Representations (2015)

  19. Kingma, D.P., Mohamed, S., Rezende, D.J., Welling, M.: Semi-supervised learning with deep generative models. In: Advances in Neural Information Processing Systems, pp. 3581–3589 (2014)

  20. Kingma, D.P., Welling, M.: Auto-encoding variational Bayes. In: Bengio, Y., LeCun, Y. (eds.) 2nd International Conference on Learning Representations, ICLR 2014, Banff, AB, Canada, 14–16 April 2014, Conference Track Proceedings (2014). http://arxiv.org/abs/1312.6114

  21. Klys, J., Snell, J., Zemel, R.: Learning latent subspaces in variational autoencoders. In: Advances in Neural Information Processing Systems (2018)

  22. Kusner, M.J., Paige, B., Hernández-Lobato, J.M.: Grammar variational autoencoder. In: International Conference on Machine Learning (2017)

  23. Lample, G., Zeghidour, N., Usunier, N., Bordes, A., Denoyer, L., Ranzato, M.: Fader networks: manipulating images by sliding attributes. In: Advances in Neural Information Processing Systems (2017)

  24. Lin, Z., Thekumparampil, K., Fanti, G., Oh, S.: InfoGAN-CR and ModelCentrality: self-supervised model training and selection for disentangling GANs. In: International Conference on Machine Learning, pp. 6127–6139. PMLR (2020)

  25. Locatello, F., et al.: Challenging common assumptions in the unsupervised learning of disentangled representations. In: International Conference on Machine Learning, pp. 4114–4124. PMLR (2019)

  26. Louizos, C., Shalit, U., Mooij, J.M., Sontag, D., Zemel, R., Welling, M.: Causal effect inference with deep latent-variable models. In: Guyon, I., et al. (eds.) Advances in Neural Information Processing Systems 30, pp. 6446–6456. Curran Associates, Inc. (2017)

  27. Louizos, C., Swersky, K., Li, Y., Welling, M., Zemel, R.S.: The variational fair autoencoder. In: Bengio, Y., LeCun, Y. (eds.) 4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, 2–4 May 2016, Conference Track Proceedings (2016). http://arxiv.org/abs/1511.00830

  28. Nesterov, V., Wieser, M., Roth, V.: 3DMolNet: a generative network for molecular structures (2020)

  29. Parbhoo, S., Wieser, M., Roth, V., Doshi-Velez, F.: Transfer learning from well-curated to less-resourced populations with HIV. In: Proceedings of the 5th Machine Learning for Healthcare Conference (2020)

  30. Parbhoo, S., Wieser, M., Wieczorek, A., Roth, V.: Information bottleneck for estimating treatment effects with systematically missing covariates. Entropy 22(4), 389 (2020)

  31. Ramakrishnan, R., Dral, P.O., Rupp, M., von Lilienfeld, O.A.: Quantum chemistry structures and properties of 134 kilo molecules. Sci. Data 1(1), 1–7 (2014)

  32. Raman, S., Fuchs, T.J., Wild, P.J., Dahl, E., Roth, V.: The Bayesian group-lasso for analyzing contingency tables. In: Proceedings of the 26th Annual International Conference on Machine Learning (2009)

  33. Rey, M., Roth, V., Fuchs, T.: Sparse meta-Gaussian information bottleneck. In: International Conference on Machine Learning, pp. 910–918. PMLR (2014)

  34. Rezende, D.J., Mohamed, S., Wierstra, D.: Stochastic backpropagation and approximate inference in deep generative models. In: International Conference on Machine Learning (2014)

  35. Robert, T., Thome, N., Cord, M.: DualDis: dual-branch disentangling with adversarial learning (2019)

  36. Simon, N., Friedman, J., Hastie, T., Tibshirani, R.: A sparse-group lasso. J. Comput. Graph. Stat. 22(2), 231–245 (2013)

  37. Song, J., Ermon, S.: Understanding the limitations of variational mutual information estimators. arXiv preprint arXiv:1910.06222 (2019)

  38. Tibshirani, R.: Regression shrinkage and selection via the lasso. J. R. Stat. Soc. (Ser. B) 58(1), 267–288 (1996)

  39. Tishby, N., Pereira, F.C., Bialek, W.: The information bottleneck method. In: Allerton Conference on Communication, Control and Computing (1999)

  40. Wieczorek, A., Roth, V.: Causal compression. arXiv preprint arXiv:1611.00261 (2016)

  41. Wieczorek, A., Roth, V.: On the difference between the information bottleneck and the deep information bottleneck. Entropy 22(2), 131 (2020)

  42. Wieczorek, A., Wieser, M., Murezzan, D., Roth, V.: Learning sparse latent representations with the deep copula information bottleneck. In: 6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, 30 April–3 May 2018, Conference Track Proceedings. OpenReview.net (2018). https://openreview.net/forum?id=Hk0wHx-RW

  43. Wieser, M.: Learning invariant representations for deep latent variable models. Ph.D. thesis, University of Basel (2020)

  44. Wieser, M., Parbhoo, S., Wieczorek, A., Roth, V.: Inverse learning of symmetries. In: Advances in Neural Information Processing Systems (2020)

  45. Wu, M., Hughes, M.C., Parbhoo, S., Zazzi, M., Roth, V., Doshi-Velez, F.: Beyond sparsity: tree regularization of deep models for interpretability (2017)

  46. Zhu, J., Park, T., Isola, P., Efros, A.A.: Unpaired image-to-image translation using cycle-consistent adversarial networks. In: International Conference on Computer Vision (2017)

Acknowledgements

This research was supported by the Swiss National Science Foundation through projects No. 167333 within the National Research Programme 75 “Big Data” (M.S.), No. P2BSP2 184359 (S.P.) and the NCCR MARVEL (V.N., M.W., A.W.). Furthermore, the authors would like to thank the anonymous reviewers for their valuable comments and suggestions.

Author information

Corresponding author

Correspondence to Maxim Samarin.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (PDF, 686 KB)

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Samarin, M., Nesterov, V., Wieser, M., Wieczorek, A., Parbhoo, S., Roth, V. (2021). Learning Conditional Invariance Through Cycle Consistency. In: Bauckhage, C., Gall, J., Schwing, A. (eds.) Pattern Recognition. DAGM GCPR 2021. Lecture Notes in Computer Science, vol. 13024. Springer, Cham. https://doi.org/10.1007/978-3-030-92659-5_24

  • DOI: https://doi.org/10.1007/978-3-030-92659-5_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-92658-8

  • Online ISBN: 978-3-030-92659-5

  • eBook Packages: Computer Science, Computer Science (R0)
