Direct Evolutionary Optimization of Variational Autoencoders with Binary Latents

  • Conference paper
  • Published in: Machine Learning and Knowledge Discovery in Databases (ECML PKDD 2022)

Abstract

Many types of data are generated at least partly by discrete causes. Deep generative models such as variational autoencoders (VAEs) with binary latents have consequently become of interest. Because of the discrete latents, standard VAE training is not possible, and the goal of previous approaches has therefore been to amend (i.e., typically anneal) discrete priors to allow for training analogous to that of conventional VAEs. Here, we diverge more strongly from conventional VAE optimization: we ask whether the discrete nature of the latents can be fully maintained by applying a direct, discrete optimization for the encoding model. In doing so, we sidestep standard VAE mechanisms such as sampling approximation, reparameterization and amortization. Direct optimization of VAEs is enabled by a combination of evolutionary algorithms and truncated posteriors as variational distributions. Such a combination has recently been suggested, and here we investigate for the first time how it can be applied to a deep model. Concretely, we (A) tie the variational method into gradient ascent for the network weights, and (B) show how the decoder is used for the optimization of variational parameters. Using image data, we observed the approach to result in much sparser codes than conventionally trained binary VAEs. For the application to image patches, which is prototypical for sparse codes, we observed very competitive performance in tasks such as ‘zero-shot’ denoising and inpainting. The dense codes emerging from conventional VAE optimization, on the other hand, seem preferable for other data, e.g., collections of images of whole single objects (CIFAR etc.), but less preferable for image patches. More generally, the realization of a very different type of optimization for binary VAEs allows for investigating advantages and disadvantages of the training method itself. Here, we observed a strong influence of the method on the learned encoding, with significant impact on VAE performance across different tasks.

J. Drefs and E. Guiraud—Joint first authorship.
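
The core ingredients named in the abstract, truncated posteriors as variational distributions and evolutionary search over binary latent states, can be illustrated with a small sketch. The NumPy code below is a toy illustration only, not the authors' implementation (for that, see the source code link under Notes): the linear "decoder", the dimensions, and the single-bit-flip mutation operator are simplifying assumptions. It shows how, for one data point, a set K of binary states is evolved using the decoder's log-joint as fitness, and how the truncated posterior is then obtained by renormalizing over the surviving states.

    import numpy as np

    rng = np.random.default_rng(0)

    H, D, S = 8, 16, 32            # number of latents, observed dims, size of truncated set K
    pi_prior, sigma2 = 0.2, 0.1    # Bernoulli prior and Gaussian observation noise (assumed values)
    W = rng.normal(size=(D, H))    # toy linear "decoder" standing in for the neural decoder


    def log_joint(s, x):
        """log p(x, s) of a binary state s under the toy decoder."""
        mean = W @ s
        ll = -0.5 * np.sum((x - mean) ** 2) / sigma2 - 0.5 * D * np.log(2 * np.pi * sigma2)
        lp = np.sum(s * np.log(pi_prior) + (1 - s) * np.log(1 - pi_prior))
        return ll + lp


    def evolve_truncated_set(K, x, n_generations=10, n_children=32):
        """One possible evolutionary refinement of the truncated set K for one data point:
        propose children by single-bit-flip mutation, score by log p(x, s), keep the best S."""
        for _ in range(n_generations):
            parents = K[rng.integers(len(K), size=n_children)]
            children = parents.copy()
            flip = rng.integers(H, size=n_children)             # flip one random bit per child
            children[np.arange(n_children), flip] ^= 1
            pool = np.unique(np.vstack([K, children]), axis=0)  # parents compete with children
            scores = np.array([log_joint(s, x) for s in pool])
            K = pool[np.argsort(scores)[-S:]]                   # keep the fittest states
        return K


    # Usage on one synthetic data point: the truncated posterior q is p(x, s) renormalized
    # over the states in K; in the full method its expectations would drive the gradient
    # ascent step on the decoder weights.
    x = W @ (rng.random(H) < pi_prior) + np.sqrt(sigma2) * rng.normal(size=D)
    K = (rng.random((S, H)) < pi_prior).astype(int)             # initial population of binary states
    K = evolve_truncated_set(K, x)
    log_w = np.array([log_joint(s, x) for s in K])
    q = np.exp(log_w - log_w.max())
    q /= q.sum()                                                # truncated posterior weights
    print("E_q[s] =", q @ K)                                    # posterior mean of the latents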


Notes

  1. Source code available at https://github.com/tvlearn.


Acknowledgments

We acknowledge funding by the German Research Foundation (DFG) under grant SFB 1330/1 & 2 (B2), ID 352015383 (JD), and the H4a cluster of excellence EXC 2177/1, ID 390895286 (JL), and by the German Federal Ministry of Education and Research (BMBF) through a Wolfgang Gentner scholarship (awarded to EG, ID 13E18CHA). Furthermore, computing infrastructure support is acknowledged through the HPC cluster CARL at UOL (DFG, grant INST 184/157-1 FUGG) and the HLRN Alliance, grant ID nim00006.

Author information

Corresponding authors

Correspondence to Jakob Drefs or Enrico Guiraud.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 3359 KB)

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Drefs, J., Guiraud, E., Panagiotou, F., Lücke, J. (2023). Direct Evolutionary Optimization of Variational Autoencoders with Binary Latents. In: Amini, M.R., Canu, S., Fischer, A., Guns, T., Kralj Novak, P., Tsoumakas, G. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2022. Lecture Notes in Computer Science, vol 13715. Springer, Cham. https://doi.org/10.1007/978-3-031-26409-2_22

  • DOI: https://doi.org/10.1007/978-3-031-26409-2_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-26408-5

  • Online ISBN: 978-3-031-26409-2

  • eBook Packages: Computer Science, Computer Science (R0)
