Auxiliary Guided Autoregressive Variational Autoencoders

  • Thomas Lucas
  • Jakob Verbeek
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11051)


Generative modeling of high-dimensional data is a key problem in machine learning. Successful approaches include latent variable models and autoregressive models. The complementary strengths of these approaches, to model global and local image statistics respectively, suggest hybrid models that encode global image structure into latent variables while autoregressively modeling low-level detail. Previous approaches to such hybrid models restrict the capacity of the autoregressive decoder to prevent degenerate models that ignore the latent variables and rely only on autoregressive modeling. Our contribution is a training procedure relying on an auxiliary loss function that controls which information is captured by the latent variables and what is left to the autoregressive decoder. Our approach can leverage arbitrarily powerful autoregressive decoders, achieves state-of-the-art quantitative performance among models with latent variables, and generates qualitatively convincing samples.
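
As a rough illustration of how an auxiliary loss might sit alongside the usual variational objective, the sketch below writes a standard variational lower bound with an autoregressive decoder and adds a generic auxiliary term. It is a sketch under assumptions, not the formulation used in the paper: the auxiliary target y (some function of the image x), its decoder, and the weight λ are hypothetical names introduced here for illustration.

```latex
% Standard variational lower bound with an autoregressive decoder
% (the decoder factorization over pixels x_i is the usual one):
\mathcal{L}(x) = \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big]
               - D_{\mathrm{KL}}\big(q_\phi(z \mid x) \,\|\, p(z)\big),
\qquad
p_\theta(x \mid z) = \prod_{i} p_\theta(x_i \mid x_{<i}, z).

% Hypothetical auxiliary term (assumption): a target y derived from x is
% predicted from z alone, without access to previous pixels x_{<i}:
\mathcal{L}_{\mathrm{aux}}(x) = \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\psi(y \mid z)\big].

% Combined training objective with an assumed weight \lambda \ge 0:
\mathcal{L}_{\mathrm{total}}(x) = \mathcal{L}(x) + \lambda \, \mathcal{L}_{\mathrm{aux}}(x).
```

Because the auxiliary decoder in this sketch sees only z, maximizing the auxiliary term can only be achieved by storing information in the latent variables. This is one way a loss of this kind could discourage the degenerate solution in which the latent variables are ignored, regardless of how powerful the autoregressive decoder is.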



This work has been partially supported by the grant ANR-16-CE23-0006 “Deep in France” and LabEx PERSYVAL-Lab (ANR-11-LABX-0025-01).



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Université Grenoble Alpes, Inria, CNRS, Grenoble INP, LJK, Grenoble, France
