Learning Conditional Invariance Through Cycle Consistency

Samarin, Maxim; Nesterov, Vitali; Wieser, Mario; Wieczorek, Aleksander; Parbhoo, Sonali; Roth, Volker

doi:10.1007/978-3-030-92659-5_24

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 13024)

Included in the following conference series:

DAGM German Conference on Pattern Recognition

Abstract

Identifying meaningful and independent factors of variation in a dataset is a challenging learning task frequently addressed by means of deep latent variable models. This task can be viewed as learning symmetry transformations preserving the value of a chosen property along latent dimensions. However, existing approaches exhibit severe drawbacks in enforcing the invariance property in the latent space. We address these shortcomings with a novel approach to cycle consistency. Our method involves two separate latent subspaces for the target property and the remaining input information, respectively. In order to enforce invariance as well as sparsity in the latent space, we incorporate semantic knowledge by using cycle consistency constraints relying on property side information. The proposed method is based on the deep information bottleneck and, in contrast to other approaches, allows using continuous target properties and provides inherent model selection capabilities. We demonstrate on synthetic and molecular data that our approach identifies more meaningful factors which lead to sparser and more interpretable models with improved invariance properties.
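The cycle-consistency idea summarized above can be illustrated with a toy sketch. It is not the authors' implementation (that is linked in the Notes), and it encodes only one plausible reading of the constraint: keep the property subspace fixed, resample the remaining subspace, decode, re-encode, and penalize any change in the predicted property. The linear maps `W`, `V_GOOD`, and `V_LEAKY`, the function names, and the squared-error penalty are all hypothetical choices made for illustration; the information bottleneck and sparsity terms of the actual method are omitted.

```python
import random

# Hypothetical toy model: a linear encoder W and two candidate decoders.
W = [[1.0, 1.0], [1.0, -1.0]]        # row 0 -> z_prop, row 1 -> z_inv
V_GOOD = [[0.5, 0.5], [0.5, -0.5]]   # exact inverse of W
V_LEAKY = [[0.5, 0.6], [0.5, -0.5]]  # lets z_inv leak into z_prop after re-encoding


def encode(x, w):
    """Map a 2-D input to (z_prop, z_inv), the two latent subspaces."""
    z_prop = w[0][0] * x[0] + w[0][1] * x[1]
    z_inv = w[1][0] * x[0] + w[1][1] * x[1]
    return z_prop, z_inv


def decode(z_prop, z_inv, v):
    """Map the two latent coordinates back to a 2-D input."""
    return (v[0][0] * z_prop + v[0][1] * z_inv,
            v[1][0] * z_prop + v[1][1] * z_inv)


def predict_property(z_prop, slope=2.0):
    """Predict the continuous target property from the property subspace only."""
    return slope * z_prop


def cycle_loss(x, w, v, rng, n_samples=8):
    """Mean squared change in the predicted property after one cycle.

    The cycle is: encode x, keep z_prop, resample z_inv, decode, re-encode.
    """
    z_prop, _ = encode(x, w)
    y = predict_property(z_prop)
    total = 0.0
    for _ in range(n_samples):
        z_inv_new = rng.gauss(0.0, 1.0)          # fresh invariant coordinate
        x_new = decode(z_prop, z_inv_new, v)     # generate a new input
        z_prop_cycled, _ = encode(x_new, w)      # close the cycle
        total += (predict_property(z_prop_cycled) - y) ** 2
    return total / n_samples


x = (1.0, 2.0)
loss_good = cycle_loss(x, W, V_GOOD, random.Random(0))
loss_leaky = cycle_loss(x, W, V_LEAKY, random.Random(0))
print("consistent decoder:", loss_good)
print("leaky decoder:     ", loss_leaky)
```

With the consistent decoder the penalty is zero up to floating-point rounding, because changing `z_inv` never alters the re-encoded property coordinate. With the leaky decoder the penalty is positive, which is the signal such a constraint would use to push the model toward invariance.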

M. Samarin and V. Nesterov—Both authors contributed equally.

Notes

1. Implementation [1, 18]: https://github.com/bmda-unibas/CondInvarianceCC.

References

Abadi, M., et al.: TensorFlow: large-scale machine learning on heterogeneous systems (2015). https://www.tensorflow.org/. Software available from tensorflow.org
Achille, A., Soatto, S.: Information dropout: learning optimal representations through noisy computation. IEEE Trans. Pattern Anal. Mach. Intell. 40(12), 2897–2905 (2018)
Ainsworth, S.K., Foti, N.J., Lee, A.K.C., Fox, E.B.: oi-VAE: output interpretable VAEs for nonlinear group factor analysis. In: Proceedings of the 35th International Conference on Machine Learning (2018)
Alemi, A.A., Fischer, I., Dillon, J.V., Murphy, K.: Deep variational information bottleneck. In: 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, 24–26 April 2017, Conference Track Proceedings. OpenReview.net (2017). https://openreview.net/forum?id=HyxQzBceg
Bouchacourt, D., Tomioka, R., Nowozin, S.: Multi-level variational autoencoder: learning disentangled representations from grouped observations. In: AAAI Conference on Artificial Intelligence (2018)
Chechik, G., Globerson, A., Tishby, N., Weiss, Y.: Information bottleneck for Gaussian variables. J. Mach. Learn. Res. 6, 165–188 (2005)
Chen, R.T., Li, X., Grosse, R., Duvenaud, D.: Isolating sources of disentanglement in variational autoencoders. arXiv preprint arXiv:1802.04942 (2018)
Chen, X., Duan, Y., Houthooft, R., Schulman, J., Sutskever, I., Abbeel, P.: InfoGAN: interpretable representation learning by information maximizing generative adversarial nets. arXiv preprint arXiv:1606.03657 (2016)
Chicharro, D., Besserve, M., Panzeri, S.: Causal learning with sufficient statistics: an information bottleneck approach. arXiv preprint arXiv:2010.05375 (2020)
Creswell, A., Mohamied, Y., Sengupta, B., Bharath, A.A.: Adversarial information factorization (2018)
Gómez-Bombarelli, R., et al.: Automatic chemical design using a data-driven continuous representation of molecules. ACS Cent. Sci. 4(2), 268–276 (2018)
Hansen, K., et al.: Machine learning predictions of molecular properties: accurate many-body potentials and nonlocality in chemical space. J. Phys. Chem. Lett. 6(12), 2326–2331 (2015)
Higgins, I., et al.: β-VAE: learning basic visual concepts with a constrained variational framework. In: International Conference on Learning Representations (2017)
Jha, A.H., Anand, S., Singh, M., Veeravasarapu, V.S.R.: Disentangling factors of variation with cycle-consistent variational auto-encoders. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11207, pp. 829–845. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01219-9_49
Keller, S.M., Samarin, M., Torres, F.A., Wieser, M., Roth, V.: Learning extremal representations with deep archetypal analysis. Int. J. Comput. Vision 129(4), 805–820 (2021)
Keller, S.M., Samarin, M., Wieser, M., Roth, V.: Deep archetypal analysis. In: Fink, G.A., Frintrop, S., Jiang, X. (eds.) DAGM GCPR 2019. LNCS, vol. 11824, pp. 171–185. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-33676-9_12
Kim, H., Mnih, A.: Disentangling by factorising. In: International Conference on Machine Learning, pp. 2649–2658. PMLR (2018)
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: International Conference on Learning Representations (2015)
Kingma, D.P., Mohamed, S., Rezende, D.J., Welling, M.: Semi-supervised learning with deep generative models. In: Advances in Neural Information Processing Systems, pp. 3581–3589 (2014)
Kingma, D.P., Welling, M.: Auto-encoding variational Bayes. In: Bengio, Y., LeCun, Y. (eds.) 2nd International Conference on Learning Representations, ICLR 2014, Banff, AB, Canada, 14–16 April 2014, Conference Track Proceedings (2014). http://arxiv.org/abs/1312.6114
Klys, J., Snell, J., Zemel, R.: Learning latent subspaces in variational autoencoders. In: Advances in Neural Information Processing Systems (2018)
Kusner, M.J., Paige, B., Hernández-Lobato, J.M.: Grammar variational autoencoder. In: International Conference on Machine Learning (2017)
Lample, G., Zeghidour, N., Usunier, N., Bordes, A., Denoyer, L., Ranzato, M.: Fader networks: manipulating images by sliding attributes. In: Advances in Neural Information Processing Systems (2017)
Lin, Z., Thekumparampil, K., Fanti, G., Oh, S.: InfoGAN-CR and modelcentrality: self-supervised model training and selection for disentangling GANs. In: International Conference on Machine Learning, pp. 6127–6139. PMLR (2020)
Locatello, F., et al.: Challenging common assumptions in the unsupervised learning of disentangled representations. In: International Conference on Machine Learning, pp. 4114–4124. PMLR (2019)
Louizos, C., Shalit, U., Mooij, J.M., Sontag, D., Zemel, R., Welling, M.: Causal effect inference with deep latent-variable models. In: Guyon, I., et al. (eds.) Advances in Neural Information Processing Systems 30, pp. 6446–6456. Curran Associates, Inc. (2017)
Louizos, C., Swersky, K., Li, Y., Welling, M., Zemel, R.S.: The variational fair autoencoder. In: Bengio, Y., LeCun, Y. (eds.) 4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, 2–4 May 2016, Conference Track Proceedings (2016). http://arxiv.org/abs/1511.00830
Nesterov, V., Wieser, M., Roth, V.: 3DMolNet: a generative network for molecular structures (2020)
Parbhoo, S., Wieser, M., Roth, V., Doshi-Velez, F.: Transfer learning from well-curated to less-resourced populations with HIV. In: Proceedings of the 5th Machine Learning for Healthcare Conference (2020)
Parbhoo, S., Wieser, M., Wieczorek, A., Roth, V.: Information bottleneck for estimating treatment effects with systematically missing covariates. Entropy 22(4), 389 (2020)
Ramakrishnan, R., Dral, P.O., Rupp, M., Von Lilienfeld, O.A.: Quantum chemistry structures and properties of 134 kilo molecules. Sci. Data 1(1), 1–7 (2014)
Raman, S., Fuchs, T.J., Wild, P.J., Dahl, E., Roth, V.: The Bayesian group-lasso for analyzing contingency tables. In: Proceedings of the 26th Annual International Conference on Machine Learning (2009)
Rey, M., Roth, V., Fuchs, T.: Sparse meta-gaussian information bottleneck. In: International Conference on Machine Learning, pp. 910–918. PMLR (2014)
Rezende, D.J., Mohamed, S., Wierstra, D.: Stochastic backpropagation and approximate inference in deep generative models. In: International Conference on Machine Learning (2014)
Robert, T., Thome, N., Cord, M.: DualDis: dual-branch disentangling with adversarial learning (2019)
Simon, N., Friedman, J., Hastie, T., Tibshirani, R.: A sparse-group lasso. J. Comput. Graph. Stat. 22(2), 231–245 (2013)
Song, J., Ermon, S.: Understanding the limitations of variational mutual information estimators. arXiv preprint arXiv:1910.06222 (2019)
Tibshirani, R.: Regression shrinkage and selection via the lasso. J. R. Stat. Soc. (Ser. B) 58(1), 267–288 (1996)
Tishby, N., Pereira, F.C., Bialek, W.: The information bottleneck method. In: Allerton Conference on Communication, Control and Computing (1999)
Wieczorek, A., Roth, V.: Causal compression. arXiv preprint arXiv:1611.00261 (2016)
Wieczorek, A., Roth, V.: On the difference between the information bottleneck and the deep information bottleneck. Entropy 22(2), 131 (2020)
Wieczorek, A., Wieser, M., Murezzan, D., Roth, V.: Learning sparse latent representations with the deep copula information bottleneck. In: 6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, 30 April–3 May 2018, Conference Track Proceedings. OpenReview.net (2018). https://openreview.net/forum?id=Hk0wHx-RW
Wieser, M.: Learning invariant representations for deep latent variable models. Ph.D. thesis, University of Basel (2020)
Wieser, M., Parbhoo, S., Wieczorek, A., Roth, V.: Inverse learning of symmetries. In: Advances in Neural Information Processing Systems (2020)
Wu, M., Hughes, M.C., Parbhoo, S., Zazzi, M., Roth, V., Doshi-Velez, F.: Beyond sparsity: tree regularization of deep models for interpretability (2017)
Zhu, J., Park, T., Isola, P., Efros, A.A.: Unpaired image-to-image translation using cycle-consistent adversarial networks. In: International Conference on Computer Vision (2017)

Acknowledgements

This research was supported by the Swiss National Science Foundation through projects No. 167333 within the National Research Programme 75 “Big Data” (M.S.), No. P2BSP2 184359 (S.P.) and the NCCR MARVEL (V.N., M.W., A.W.). Furthermore, the authors would like to thank the anonymous reviewers for their valuable comments and suggestions.

Author information

Authors and Affiliations

Department of Mathematics and Computer Science, University of Basel, Spiegelgasse 1, 4051, Basel, Switzerland
Maxim Samarin, Vitali Nesterov, Mario Wieser, Aleksander Wieczorek & Volker Roth
Harvard John A. Paulson School of Engineering and Applied Sciences, Harvard University, 150 Western Avenue, Boston, MA, 02134, USA
Sonali Parbhoo

Corresponding author

Correspondence to Maxim Samarin.

Editor information

Editors and Affiliations

Fraunhofer IAIS, Sankt Augustin, Germany
Christian Bauckhage
University of Bonn, Bonn, Germany
Juergen Gall
University of Illinois at Urbana-Champaign, Urbana, IL, USA
Alexander Schwing

About this paper

Cite this paper

Samarin, M., Nesterov, V., Wieser, M., Wieczorek, A., Parbhoo, S., Roth, V. (2021). Learning Conditional Invariance Through Cycle Consistency. In: Bauckhage, C., Gall, J., Schwing, A. (eds) Pattern Recognition. DAGM GCPR 2021. Lecture Notes in Computer Science, vol 13024. Springer, Cham. https://doi.org/10.1007/978-3-030-92659-5_24

DOI: https://doi.org/10.1007/978-3-030-92659-5_24
Published: 13 January 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-92658-8
Online ISBN: 978-3-030-92659-5
eBook Packages: Computer Science, Computer Science (R0)