Abstract
Many types of data are generated at least partly by discrete causes. Deep generative models such as variational autoencoders (VAEs) with binary latents have consequently become of interest. Because of the discrete latents, standard VAE training is not possible, and the goal of previous approaches has therefore been to amend (i.e., typically anneal) discrete priors to allow for training analogous to that of conventional VAEs. Here, we diverge more strongly from conventional VAE optimization: we ask whether the discrete nature of the latents can be fully maintained by applying a direct, discrete optimization for the encoding model. In doing so, we sidestep standard VAE mechanisms such as sampling approximations, reparameterization, and amortization. Direct optimization of VAEs is enabled by a combination of evolutionary algorithms and truncated posteriors as variational distributions. Such a combination has recently been suggested, and we investigate here, for the first time, how it can be applied to a deep model. Concretely, we (A) tie the variational method into gradient ascent for the network weights, and (B) show how the decoder is used for the optimization of the variational parameters. On image data, we observed the approach to result in much sparser codes than those of conventionally trained binary VAEs. Considering image patches, the prototypical application of sparse codes, we observed very competitive performance in tasks such as 'zero-shot' denoising and inpainting. The dense codes emerging from conventional VAE optimization, on the other hand, seem preferable for other data, e.g., collections of images of whole single objects (CIFAR etc.), but less preferable for image patches. More generally, the realization of a very different type of optimization for binary VAEs makes it possible to investigate advantages and disadvantages of the training method itself. Indeed, we observed a strong influence of the training method on the learned encoding, with significant impact on VAE performance across different tasks.
J. Drefs and E. Guiraud—Joint first authorship.
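To make the training scheme described in the abstract concrete, the following is a minimal, self-contained PyTorch sketch of direct discrete optimization with truncated posteriors and an evolutionary search. It is an illustration under assumed model choices (a fixed Bernoulli prior, a Gaussian observation model, and toy sizes H, D, S), not the authors' implementation; the official source code is linked under Notes below.

# Minimal sketch, NOT the authors' implementation. All names, sizes, and the
# fixed Bernoulli/Gaussian model are illustrative assumptions. It shows the
# two ingredients named in the abstract: (A) gradient ascent on the decoder
# weights under a truncated variational bound, and (B) an evolutionary update
# of per-datapoint sets of binary latent states, with the decoder acting as
# the fitness function for the variational parameters.
import math
import torch

H, D, S = 16, 64, 32   # latent dim, data dim, states kept per datapoint
PI = 0.1               # Bernoulli prior activation probability (assumed fixed)
SIGMA2 = 0.1           # Gaussian observation noise variance (assumed fixed)

decoder = torch.nn.Sequential(          # mean of p(x | s; Theta)
    torch.nn.Linear(H, 128), torch.nn.ReLU(), torch.nn.Linear(128, D))
opt = torch.optim.Adam(decoder.parameters(), lr=1e-3)

def log_joint(s, x):
    """log p(s, x; Theta) up to an additive constant; s: (S, H), x: (D,)."""
    log_prior = (s * math.log(PI) + (1.0 - s) * math.log(1.0 - PI)).sum(-1)
    log_lik = -0.5 / SIGMA2 * ((x - decoder(s)) ** 2).sum(-1)
    return log_prior + log_lik

def evolve(states, x, n_parents=8, n_generations=2):
    """Evolutionary refinement of the truncated set: select fit parents,
    mutate them by single random bit flips, and keep the distinct states
    with the highest joint probability. The decoder enters only through
    log_joint, i.e. it scores the candidate variational parameters."""
    for _ in range(n_generations):
        parents = states[log_joint(states, x).topk(n_parents).indices]
        children = parents.clone()
        rows = torch.arange(n_parents)
        cols = torch.randint(0, states.shape[1], (n_parents,))
        children[rows, cols] = 1.0 - children[rows, cols]   # bit-flip mutation
        pool = torch.cat([states, children]).unique(dim=0)
        keep = min(S, pool.shape[0])
        states = pool[log_joint(pool, x).topk(keep).indices]
    return states

def train_step(x, states):
    with torch.no_grad():              # discrete E-step: no gradients needed
        states = evolve(states, x)
    # Truncated bound (up to constants): log of the sum of p(s, x; Theta)
    # over the states kept for this datapoint.
    loss = -torch.logsumexp(log_joint(states, x), dim=0)
    opt.zero_grad()
    loss.backward()                    # M-step: gradient ascent on decoder
    opt.step()
    return states

# Toy usage: one datapoint standing in for an image patch.
x = torch.randn(D)
states = torch.bernoulli(torch.full((S, H), PI))   # initial truncated set
for _ in range(100):
    states = train_step(x, states)

Note the division of labor this sketch illustrates: the truncated set of binary states plays the role of the (per-datapoint) variational parameters and is improved purely by discrete search, so no sampling approximation, reparameterization trick, or amortized encoder network is required.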
Notes
1. Source code available at https://github.com/tvlearn.
Acknowledgments
We acknowledge funding by the German Research Foundation (DFG) under grant SFB 1330/1&2 (B2), ID 352015383 (JD), and under the H4a Cluster of Excellence EXC 2177/1, ID 390895286 (JL), and by the German Federal Ministry of Education and Research (BMBF) through a Wolfgang Gentner scholarship (awarded to EG, ID 13E18CHA). Furthermore, we acknowledge computing infrastructure support through the HPC cluster CARL at UOL (DFG grant INST 184/157-1 FUGG) and through the HLRN Alliance, grant ID nim00006.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Drefs, J., Guiraud, E., Panagiotou, F., Lücke, J. (2023). Direct Evolutionary Optimization of Variational Autoencoders with Binary Latents. In: Amini, M.R., Canu, S., Fischer, A., Guns, T., Kralj Novak, P., Tsoumakas, G. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2022. Lecture Notes in Computer Science, vol. 13715. Springer, Cham. https://doi.org/10.1007/978-3-031-26409-2_22
DOI: https://doi.org/10.1007/978-3-031-26409-2_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-26408-5
Online ISBN: 978-3-031-26409-2
eBook Packages: Computer Science (R0)