Increasing the Generalisaton Capacity of Conditional VAEs

Klushyn, Alexej; Chen, Nutan; Cseke, Botond; Bayer, Justin; van der Smagt, Patrick

doi:10.1007/978-3-030-30484-3_61

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11728)

Included in the following conference series:

International Conference on Artificial Neural Networks

Abstract

We address the problem of one-to-many mappings in supervised learning, where a single instance has many different solutions of possibly equal cost. The framework of conditional variational autoencoders describes a class of methods to tackle such structured-prediction tasks by means of latent variables. We propose to incentivise informative latent representations for increasing the generalisation capacity of conditional variational autoencoders. To this end, we modify the latent variable model by defining the likelihood as a function of the latent variable only and introduce an expressive multimodal prior that enables the model to capture semantically meaningful features of the data. To validate our approach, we train our model on the Cornell Robot Grasping dataset and on modified versions of MNIST and Fashion-MNIST, obtaining results that show a significantly higher generalisation capability.
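
As a rough illustration of the modification described in the abstract, the sketch below contrasts the standard conditional VAE bound with one possible reading of the modified model. The symbols are assumptions introduced here and are not taken from the chapter: x is the input (condition), y the output, z the latent variable, q_phi the approximate posterior, and the mixture form of the prior (with weights pi_k and K components) is purely hypothetical. The chapter itself may define these quantities differently.

```latex
% Standard conditional VAE bound: the likelihood depends on both x and z.
\log p_\theta(y \mid x) \;\ge\;
  \mathbb{E}_{q_\phi(z \mid x, y)}\!\left[\log p_\theta(y \mid x, z)\right]
  - \mathrm{KL}\!\left(q_\phi(z \mid x, y)\,\|\,p_\theta(z \mid x)\right)

% Assumed reading of the modification: the likelihood is a function of z only,
% so everything the decoder needs to know about y must be encoded in z.
\log p_\theta(y \mid x) \;\ge\;
  \mathbb{E}_{q_\phi(z \mid x, y)}\!\left[\log p_\theta(y \mid z)\right]
  - \mathrm{KL}\!\left(q_\phi(z \mid x, y)\,\|\,p_\theta(z \mid x)\right)

% Hypothetical example of a multimodal conditional prior (illustration only):
% a K-component Gaussian mixture whose parameters depend on x.
p_\theta(z \mid x) \;=\; \sum_{k=1}^{K} \pi_k(x)\,
  \mathcal{N}\!\left(z;\ \mu_k(x),\ \sigma_k^2(x)\, I\right)
```

Under this reading, the condition x influences the output only through the prior over z. This is one way in which the latent representation could be pushed to carry the information that the abstract calls semantically meaningful.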

Author information

Authors and Affiliations

Machine Learning Research Lab, Volkswagen Group, Munich, Germany
Alexej Klushyn, Nutan Chen, Botond Cseke, Justin Bayer & Patrick van der Smagt

Corresponding author

Correspondence to Alexej Klushyn.

Editor information

Editors and Affiliations

Helmholtz Zentrum München - Deutsches Forschungszentrum für Gesundheit und Umwelt (GmbH), Neuherberg, Germany
Igor V. Tetko
Institute of Computer Science, Czech Academy of Sciences, Prague 8, Czech Republic
Věra Kůrková
Helmholtz Zentrum München - Deutsches Forschungszentrum für Gesundheit und Umwelt (GmbH), Neuherberg, Germany
Pavel Karpov
Helmholtz Zentrum München - Deutsches Forschungszentrum für Gesundheit und Umwelt (GmbH), Neuherberg, Germany
Fabian Theis

About this paper

Cite this paper

Klushyn, A., Chen, N., Cseke, B., Bayer, J., van der Smagt, P. (2019). Increasing the Generalisaton Capacity of Conditional VAEs. In: Tetko, I., Kůrková, V., Karpov, P., Theis, F. (eds) Artificial Neural Networks and Machine Learning – ICANN 2019: Deep Learning. ICANN 2019. Lecture Notes in Computer Science, vol 11728. Springer, Cham. https://doi.org/10.1007/978-3-030-30484-3_61

DOI: https://doi.org/10.1007/978-3-030-30484-3_61
Published: 09 September 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30483-6
Online ISBN: 978-3-030-30484-3
eBook Packages: Computer Science, Computer Science (R0)
