Abstract
Identifying meaningful and independent factors of variation in a dataset is a challenging learning task frequently addressed by means of deep latent variable models. This task can be viewed as learning symmetry transformations preserving the value of a chosen property along latent dimensions. However, existing approaches exhibit severe drawbacks in enforcing the invariance property in the latent space. We address these shortcomings with a novel approach to cycle consistency. Our method involves two separate latent subspaces for the target property and the remaining input information, respectively. In order to enforce invariance as well as sparsity in the latent space, we incorporate semantic knowledge by using cycle consistency constraints relying on property side information. The proposed method is based on the deep information bottleneck and, in contrast to other approaches, allows using continuous target properties and provides inherent model selection capabilities. We demonstrate on synthetic and molecular data that our approach identifies more meaningful factors which lead to sparser and more interpretable models with improved invariance properties.
M. Samarin and V. Nesterov contributed equally.
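The cycle consistency idea described in the abstract can be illustrated with a small numerical sketch. This is only one plausible reading of the abstract, not the authors' implementation: the linear maps, the dimensions, and the names `encode`, `decode`, `predict_property` and `cycle_consistency_loss` are all hypothetical stand-ins, and the information bottleneck terms are omitted. The sketch splits the latent code into a property subspace and a remaining subspace, resamples the remaining part, decodes, re-encodes, and penalises any change in the predicted property.

```python
import numpy as np

# Hypothetical sizes: input, property subspace, remaining (invariant) subspace.
D_X, D_PROP, D_INV = 6, 1, 3


def encode(x, W_enc):
    """Toy linear encoder; returns (property part, remaining part) of the latent code."""
    z = x @ W_enc
    return z[:, :D_PROP], z[:, D_PROP:]


def decode(z_prop, z_inv, W_dec):
    """Toy linear decoder acting on the concatenated latent code."""
    return np.concatenate([z_prop, z_inv], axis=1) @ W_dec


def predict_property(z_prop, w_prop):
    """Toy linear property head; it sees only the property subspace."""
    return z_prop @ w_prop


def cycle_consistency_loss(x, W_enc, W_dec, w_prop, rng):
    """Mean absolute change of the predicted property after one cycle.

    The property part of the code is kept fixed, the remaining part is
    resampled, and the decoded sample is encoded again.
    """
    z_prop, z_inv = encode(x, W_enc)
    y_hat = predict_property(z_prop, w_prop)
    z_inv_new = rng.normal(size=z_inv.shape)       # resample the remaining subspace
    x_new = decode(z_prop, z_inv_new, W_dec)       # new input, same property code
    z_prop_cycle, _ = encode(x_new, W_enc)         # close the cycle
    y_cycle = predict_property(z_prop_cycle, w_prop)
    return float(np.mean(np.abs(y_hat - y_cycle)))


rng = np.random.default_rng(0)
x = rng.normal(size=(8, D_X))
W_enc = rng.normal(size=(D_X, D_PROP + D_INV))
w_prop = rng.normal(size=(D_PROP,))

# A random decoder generally violates the constraint ...
W_dec_random = rng.normal(size=(D_PROP + D_INV, D_X))
loss_random = cycle_consistency_loss(x, W_enc, W_dec_random, w_prop, rng)

# ... while a decoder that exactly inverts the encoder on the latent space satisfies it.
W_dec_inverse = np.linalg.pinv(W_enc)
loss_inverse = cycle_consistency_loss(x, W_enc, W_dec_inverse, w_prop, rng)
```

In this toy setting the loss vanishes, up to floating-point error, when re-encoding a decoded sample returns the same latent code. In a trained model such a term would be one component of the objective alongside reconstruction and compression terms, and its weighting would be a design choice.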
Notes
1. Implementation [1, 18]: https://github.com/bmda-unibas/CondInvarianceCC.
References
Abadi, M., et al.: TensorFlow: large-scale machine learning on heterogeneous systems (2015). https://www.tensorflow.org/. Software available from tensorflow.org
Achille, A., Soatto, S.: Information dropout: learning optimal representations through noisy computation. IEEE Trans. Pattern Anal. Mach. Intell. 40(12), 2897–2905 (2018)
Ainsworth, S.K., Foti, N.J., Lee, A.K.C., Fox, E.B.: oi-VAE: output interpretable VAEs for nonlinear group factor analysis. In: Proceedings of the 35th International Conference on Machine Learning (2018)
Alemi, A.A., Fischer, I., Dillon, J.V., Murphy, K.: Deep variational information bottleneck. In: 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, 24–26 April 2017, Conference Track Proceedings. OpenReview.net (2017). https://openreview.net/forum?id=HyxQzBceg
Bouchacourt, D., Tomioka, R., Nowozin, S.: Multi-level variational autoencoder: learning disentangled representations from grouped observations. In: AAAI Conference on Artificial Intelligence (2018)
Chechik, G., Globerson, A., Tishby, N., Weiss, Y.: Information bottleneck for Gaussian variables. J. Mach. Learn. Res. 6, 165–188 (2005)
Chen, R.T., Li, X., Grosse, R., Duvenaud, D.: Isolating sources of disentanglement in variational autoencoders. arXiv preprint arXiv:1802.04942 (2018)
Chen, X., Duan, Y., Houthooft, R., Schulman, J., Sutskever, I., Abbeel, P.: InfoGAN: interpretable representation learning by information maximizing generative adversarial nets. arXiv preprint arXiv:1606.03657 (2016)
Chicharro, D., Besserve, M., Panzeri, S.: Causal learning with sufficient statistics: an information bottleneck approach. arXiv preprint arXiv:2010.05375 (2020)
Creswell, A., Mohamied, Y., Sengupta, B., Bharath, A.A.: Adversarial information factorization (2018)
Gómez-Bombarelli, R., et al.: Automatic chemical design using a data-driven continuous representation of molecules. ACS Cent. Sci. 4(2), 268–276 (2018)
Hansen, K., et al.: Machine learning predictions of molecular properties: accurate many-body potentials and nonlocality in chemical space. J. Phys. Chem. Lett. 6(12), 2326–2331 (2015)
Higgins, I., et al.: \(\beta\)-VAE: learning basic visual concepts with a constrained variational framework. In: International Conference on Learning Representations (2017)
Jha, A.H., Anand, S., Singh, M., Veeravasarapu, V.S.R.: Disentangling factors of variation with cycle-consistent variational auto-encoders. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11207, pp. 829–845. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01219-9_49
Keller, S.M., Samarin, M., Torres, F.A., Wieser, M., Roth, V.: Learning extremal representations with deep archetypal analysis. Int. J. Comput. Vision 129(4), 805–820 (2021)
Keller, S.M., Samarin, M., Wieser, M., Roth, V.: Deep archetypal analysis. In: Fink, G.A., Frintrop, S., Jiang, X. (eds.) DAGM GCPR 2019. LNCS, vol. 11824, pp. 171–185. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-33676-9_12
Kim, H., Mnih, A.: Disentangling by factorising. In: International Conference on Machine Learning, pp. 2649–2658. PMLR (2018)
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: International Conference on Learning Representations (2015)
Kingma, D.P., Mohamed, S., Rezende, D.J., Welling, M.: Semi-supervised learning with deep generative models. In: Advances in Neural Information Processing Systems, pp. 3581–3589 (2014)
Kingma, D.P., Welling, M.: Auto-encoding variational Bayes. In: Bengio, Y., LeCun, Y. (eds.) 2nd International Conference on Learning Representations, ICLR 2014, Banff, AB, Canada, 14–16 April 2014, Conference Track Proceedings (2014). http://arxiv.org/abs/1312.6114
Klys, J., Snell, J., Zemel, R.: Learning latent subspaces in variational autoencoders. In: Advances in Neural Information Processing Systems (2018)
Kusner, M.J., Paige, B., Hernández-Lobato, J.M.: Grammar variational autoencoder. In: International Conference on Machine Learning (2017)
Lample, G., Zeghidour, N., Usunier, N., Bordes, A., Denoyer, L., Ranzato, M.: Fader networks: manipulating images by sliding attributes. In: Advances in Neural Information Processing Systems (2017)
Lin, Z., Thekumparampil, K., Fanti, G., Oh, S.: InfoGAN-CR and modelcentrality: self-supervised model training and selection for disentangling GANs. In: International Conference on Machine Learning, pp. 6127–6139. PMLR (2020)
Locatello, F., et al.: Challenging common assumptions in the unsupervised learning of disentangled representations. In: International Conference on Machine Learning, pp. 4114–4124. PMLR (2019)
Louizos, C., Shalit, U., Mooij, J.M., Sontag, D., Zemel, R., Welling, M.: Causal effect inference with deep latent-variable models. In: Guyon, I., et al. (eds.) Advances in Neural Information Processing Systems 30, pp. 6446–6456. Curran Associates, Inc. (2017)
Louizos, C., Swersky, K., Li, Y., Welling, M., Zemel, R.S.: The variational fair autoencoder. In: Bengio, Y., LeCun, Y. (eds.) 4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, 2–4 May 2016, Conference Track Proceedings (2016). http://arxiv.org/abs/1511.00830
Nesterov, V., Wieser, M., Roth, V.: 3DMolNet: a generative network for molecular structures (2020)
Parbhoo, S., Wieser, M., Roth, V., Doshi-Velez, F.: Transfer learning from well-curated to less-resourced populations with HIV. In: Proceedings of the 5th Machine Learning for Healthcare Conference (2020)
Parbhoo, S., Wieser, M., Wieczorek, A., Roth, V.: Information bottleneck for estimating treatment effects with systematically missing covariates. Entropy 22(4), 389 (2020)
Ramakrishnan, R., Dral, P.O., Rupp, M., Von Lilienfeld, O.A.: Quantum chemistry structures and properties of 134 kilo molecules. Sci. Data 1(1), 1–7 (2014)
Raman, S., Fuchs, T.J., Wild, P.J., Dahl, E., Roth, V.: The Bayesian group-lasso for analyzing contingency tables. In: Proceedings of the 26th Annual International Conference on Machine Learning (2009)
Rey, M., Roth, V., Fuchs, T.: Sparse meta-gaussian information bottleneck. In: International Conference on Machine Learning, pp. 910–918. PMLR (2014)
Rezende, D.J., Mohamed, S., Wierstra, D.: Stochastic backpropagation and approximate inference in deep generative models. In: International Conference on Machine Learning (2014)
Robert, T., Thome, N., Cord, M.: DualDis: dual-branch disentangling with adversarial learning (2019)
Simon, N., Friedman, J., Hastie, T., Tibshirani, R.: A sparse-group lasso. J. Comput. Graph. Stat. 22(2), 231–245 (2013)
Song, J., Ermon, S.: Understanding the limitations of variational mutual information estimators. arXiv preprint arXiv:1910.06222 (2019)
Tibshirani, R.: Regression shrinkage and selection via the lasso. J. R. Stat. Soc. (Ser. B) 58(1), 267–288 (1996)
Tishby, N., Pereira, F.C., Bialek, W.: The information bottleneck method. In: Allerton Conference on Communication, Control and Computing (1999)
Wieczorek, A., Roth, V.: Causal compression. arXiv preprint arXiv:1611.00261 (2016)
Wieczorek, A., Roth, V.: On the difference between the information bottleneck and the deep information bottleneck. Entropy 22(2), 131 (2020)
Wieczorek, A., Wieser, M., Murezzan, D., Roth, V.: Learning sparse latent representations with the deep copula information bottleneck. In: 6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, 30 April–3 May 2018, Conference Track Proceedings. OpenReview.net (2018). https://openreview.net/forum?id=Hk0wHx-RW
Wieser, M.: Learning invariant representations for deep latent variable models. Ph.D. thesis, University of Basel (2020)
Wieser, M., Parbhoo, S., Wieczorek, A., Roth, V.: Inverse learning of symmetries. In: Advances in Neural Information Processing Systems (2020)
Wu, M., Hughes, M.C., Parbhoo, S., Zazzi, M., Roth, V., Doshi-Velez, F.: Beyond sparsity: tree regularization of deep models for interpretability (2017)
Zhu, J., Park, T., Isola, P., Efros, A.A.: Unpaired image-to-image translation using cycle-consistent adversarial networks. In: International Conference on Computer Vision (2017)
Acknowledgements
This research was supported by the Swiss National Science Foundation through projects No. 167333 within the National Research Programme 75 “Big Data” (M.S.), No. P2BSP2 184359 (S.P.) and the NCCR MARVEL (V.N., M.W., A.W.). Furthermore, the authors would like to thank the anonymous reviewers for their valuable comments and suggestions.
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Samarin, M., Nesterov, V., Wieser, M., Wieczorek, A., Parbhoo, S., Roth, V. (2021). Learning Conditional Invariance Through Cycle Consistency. In: Bauckhage, C., Gall, J., Schwing, A. (eds) Pattern Recognition. DAGM GCPR 2021. Lecture Notes in Computer Science, vol 13024. Springer, Cham. https://doi.org/10.1007/978-3-030-92659-5_24
DOI: https://doi.org/10.1007/978-3-030-92659-5_24
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-92658-8
Online ISBN: 978-3-030-92659-5
eBook Packages: Computer Science (R0)