
Learning Energy-Based Models with Adversarial Training

  • Conference paper
  • In: Computer Vision – ECCV 2022 (ECCV 2022)

Abstract

We study a new approach to learning energy-based models (EBMs) based on adversarial training (AT). We show that (binary) AT learns a special kind of energy function that models the support of the data distribution, and that the learning process is closely related to MCMC-based maximum likelihood learning of EBMs. We further propose improved techniques for generative modeling with AT, and demonstrate that this new approach is capable of generating diverse and realistic images. Aside from image generation performance competitive with explicit EBMs, the studied approach is stable to train, is well suited for image translation tasks, and exhibits strong out-of-distribution adversarial robustness. Our results demonstrate the viability of the AT approach to generative modeling, suggesting that AT is a competitive alternative for learning EBMs.
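To make the claimed connection concrete, here is a minimal sketch (ours, not the authors' released code) of how binary AT can be read as EBM learning: a network plays the energy function, PGD-style gradient descent on the energy stands in for the noise-free Langevin "negative phase" of MCMC-based maximum likelihood, and a binary logistic loss contrasts data against the adversarially generated samples. All names (energy_net, pgd_sample, training_step) and hyperparameters (step counts, step size, clamp range) are illustrative assumptions, not values from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy energy function E_theta: R^784 -> R (e.g., flattened 28x28 images).
energy_net = nn.Sequential(
    nn.Linear(784, 256), nn.SiLU(),
    nn.Linear(256, 1),
)

def pgd_sample(x0, steps=10, step_size=0.1):
    """Generate negative samples by descending the energy from noise x0.

    Each update x <- x - step_size * sign(grad_x E(x)) is the adversarial
    counterpart of a (noise-free) Langevin step: it moves samples toward
    low-energy, i.e. high-density, regions of the current model.
    """
    x = x0.clone().detach()
    for _ in range(steps):
        x.requires_grad_(True)
        grad, = torch.autograd.grad(energy_net(x).sum(), x)
        x = (x - step_size * grad.sign()).detach().clamp(0.0, 1.0)
    return x

def training_step(x_data, optimizer):
    """One contrastive update: data are positives, PGD samples negatives.

    Using -E(x) as the "real" logit, minimizing the logistic loss lowers
    the energy on data and raises it on the adversarial samples -- the
    same descent direction as the maximum-likelihood gradient
    E_data[dE/dtheta] - E_model[dE/dtheta], up to sample reweighting.
    """
    x_neg = pgd_sample(torch.rand_like(x_data))  # negatives from noise
    logits = torch.cat([-energy_net(x_data), -energy_net(x_neg)])
    labels = torch.cat([torch.ones(len(x_data), 1),
                        torch.zeros(len(x_neg), 1)])
    loss = F.binary_cross_entropy_with_logits(logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Example usage on random data:
# opt = torch.optim.Adam(energy_net.parameters(), lr=1e-4)
# print(training_step(torch.rand(64, 784), opt))
```

Under this reading, the attack step replaces the MCMC sampler of explicit EBM training, which is one way to see why the abstract describes the learned energy as modeling the support of the data distribution rather than its full density.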



Author information

Corresponding author: Xuwang Yin.


Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (PDF 19,690 KB)


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Yin, X., Li, S., Rohde, G.K. (2022). Learning Energy-Based Models with Adversarial Training. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13665. Springer, Cham. https://doi.org/10.1007/978-3-031-20065-6_13


  • DOI: https://doi.org/10.1007/978-3-031-20065-6_13


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-20064-9

  • Online ISBN: 978-3-031-20065-6

  • eBook Packages: Computer Science, Computer Science (R0)
