The physics of energy-based models

  • Research Article
  • Published in: Quantum Machine Intelligence

Abstract

Energy-based models (EBMs) are experiencing a resurgence of interest in both the physics and machine learning communities. This article provides an intuitive introduction to EBMs that requires no background in machine learning: it connects elementary concepts from physics with basic concepts and tools in generative modeling, and closes with a perspective on where current research in the field is heading. The article was originally written as an online lecture note in HTML and JavaScript and contains interactive graphics; we recommend that the reader also visit the interactive version.

[Figures 1–7 are available in the full article.]


Notes

  1. The entropy of a system is given by \(S = \sum _x -P(x)\log P(x)\), where P(x) is the probability of a state x. If all states are equally likely, the entropy is maximal. If a single state has \(P(x) = 1\), the entropy is minimal (zero).
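As a concrete numerical illustration of this formula (a sketch added here, not part of the original article):

```python
import numpy as np

def entropy(p):
    """Shannon/Gibbs entropy S = -sum_x P(x) log P(x), with 0 log 0 := 0."""
    p = np.asarray(p, dtype=float)
    nz = p[p > 0]                       # skip zero-probability states
    return float(-np.sum(nz * np.log(nz)))

# Uniform distribution over 4 states: maximal entropy, log(4) ≈ 1.386
print(entropy([0.25, 0.25, 0.25, 0.25]))
# A single certain state: minimal entropy, 0.0
print(entropy([1.0, 0.0, 0.0, 0.0]))
```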

  2. By definition, being at thermal equilibrium with a bath at temperature T means that the system is also at temperature T. The temperature determines the average energy of the system: with large T, the probability of high-energy states increases, and so does the average energy.
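This can be checked numerically. The following sketch (added here for illustration, with units such that \(k_B = 1\)) computes Boltzmann probabilities \(P(x) \propto e^{-E(x)/T}\) at two temperatures and shows that the average energy rises with T:

```python
import numpy as np

def boltzmann_probs(energies, T):
    """Boltzmann distribution P(x) ∝ exp(-E(x)/T), with k_B = 1."""
    w = np.exp(-np.asarray(energies, dtype=float) / T)
    return w / w.sum()

energies = [0.0, 1.0, 2.0]
for T in (0.5, 5.0):
    p = boltzmann_probs(energies, T)
    avg_E = float(np.dot(p, energies))
    print(f"T={T}: P={np.round(p, 3)}, <E>={avg_E:.3f}")
# At larger T the weight on the high-energy states grows, raising <E>.
```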

  3. For any distribution in the exponential family, a statistic T(x) is sufficient if we can write the probability p(x) as

    $$\begin{aligned}p(x) = \exp \left( \alpha (\theta ) T(x) + A(\theta ) \right) ,\end{aligned}$$

    where \(\alpha (\theta )\) is a vector-valued function and \(A(\theta )\) is a scalar, which for a Boltzmann distribution is related to the partition function as \(A(\theta ) = \log (1/Z)\) (Li et al. 2013).

  4. There are other conventions for the Ising energy function in the literature, where the signs of the energy terms change. For example,

    $$\begin{aligned} E(\sigma ) = \sum \limits _i b_i \sigma _i + \sum \limits _{ij} w_{ij} \sigma _i\sigma _j. \end{aligned}$$

    We follow the convention introduced above.
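    The two conventions differ only by an overall sign of the energy terms. A minimal sketch of evaluating the Ising energy (added here for illustration; the `sign` argument is our own device, not notation from the article):

```python
import numpy as np

def ising_energy(sigma, w, b, sign=-1.0):
    """Ising energy sign * (sum_i b_i sigma_i + sum_{i<j} w_ij sigma_i sigma_j).

    sign=-1 gives the common convention E = -sum b sigma - sum w sigma sigma;
    sign=+1 gives the alternative convention quoted in this note.
    """
    sigma = np.asarray(sigma, dtype=float)
    field = float(np.dot(b, sigma))
    # Use the upper triangle of w so each pair (i, j) is counted once.
    coupling = float(np.sum(np.triu(w, k=1) * np.outer(sigma, sigma)))
    return sign * (field + coupling)

# Two spins, ferromagnetic coupling w_01 = 1, no local fields:
w = np.array([[0.0, 1.0], [1.0, 0.0]])
b = np.zeros(2)
print(ising_energy([+1, +1], w, b))  # aligned spins: -1.0
print(ising_energy([+1, -1], w, b))  # anti-aligned spins: +1.0
```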

  5. Non-commuting operators do not have a common eigenfunction or eigenstate. Since these eigenfunctions (for Hermitian operators in quantum mechanics) are orthogonal, non-commuting operators lead to non-orthogonal eigenfunctions or eigenstates, and therefore no measurement operator can reliably distinguish these non-orthogonal states. This gives rise to the uncertainty principle. The no-cloning theorem follows a similar argument: orthogonal states can be cloned, but non-orthogonal ones cannot.

  6. A droplet is a local low-energy cluster of spins whose distribution is disconnected from the rest of the system.

References

  • Amin MH, Andriyash E, Rolfe J, Kulchytskyy B, Melko R (2018) Quantum Boltzmann machine. Physical Review X 8(2):021050

  • Aurell E, Ekeberg M (2012) Inverse Ising inference using all the data. Physical Review Letters 108(9):090201

  • Amit DJ, Gutfreund H, Sompolinsky H (1985) Spin-glass models of neural networks. Physical Review A 32(2):1007

  • Ackley DH, Hinton GE, Sejnowski TJ (1985) A learning algorithm for Boltzmann machines. Cognitive Science 9(1):147–169

  • Biamonte JD (2008) Nonperturbative k-body to two-body commuting conversion Hamiltonians and embedding problem instances into Ising spins. Physical Review A 77(5):052331

  • Babbush R, O’Gorman B, Aspuru-Guzik A (2013) Resource efficient gadgets for compiling adiabatic quantum optimization problems. Annalen der Physik 525(10–11):877–888

  • Borders WA, Pervaiz AZ, Fukami S, Camsari KY, Ohno H, Datta S (2019) Integer factorization using stochastic magnetic tunnel junctions. Nature 573(7774):390–393

  • Benedetti M, Realpe-Gómez J, Biswas R, Perdomo-Ortiz A (2017) Quantum-assisted learning of hardware-embedded probabilistic graphical models. Physical Review X 7(4):041052

  • Courville A, Bergstra J, Bengio Y (2011) A spike and slab restricted Boltzmann machine. In: Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics. JMLR Workshop and Conference Proceedings, pp 233–241

  • Cipra BA (1987) An introduction to the Ising model. The American Mathematical Monthly 94(10):937–959

  • Carreira-Perpinan MA, Hinton GE (2005) On contrastive divergence learning. In: Aistats, vol 10. Citeseer, pp 33–40

  • Carleo G, Troyer M (2017) Solving the quantum many-body problem with artificial neural networks. Science 355(6325):602–606

  • Du Y, Lin T, Mordatch I (2019) Model based planning with energy based models. arXiv:1909.06878

  • Du Y, Mordatch I (2019) Implicit generation and generalization in energy-based models

  • Dahl G, Ranzato MA, Mohamed A-R, Hinton GE (2010) Phone recognition with the mean-covariance restricted Boltzmann machine. In: Advances in neural information processing systems. pp 469–477

  • Earl DJ, Deem MW (2005) Parallel tempering: theory, applications, and new perspectives. Phys Chem Chem Phys 7:3910–3916

  • Finn C, Christiano P, Abbeel P, Levine S (2016) A connection between generative adversarial networks, inverse reinforcement learning, and energy-based models. arXiv:1611.03852

  • Gao X, Duan L-M (2017) Efficient representation of quantum many-body states with deep neural networks. Nature Communications 8(1):662

  • Goldstein H (2002) Classical mechanics. Pearson Education

  • Hastings WK (1970) Monte Carlo sampling methods using Markov chains and their applications

  • Hamze F, de Freitas N (2012) From fields to trees. arXiv:1207.4149

  • Huembeli P, Dauphin A, Wittek P, Gogolin C (2019) Automated discovery of characteristic features of phase transitions in many-body localization. Physical Review B 99(10):104106

  • Hen I (2017) Solving spin glasses with optimized trees of clustered spins. Phys Rev E 96:022105

  • Hopfield JJ, Feinstein DI, Palmer RG (1983) Unlearning has a stabilizing effect in collective memories. Nature 304(5922):158

  • Hartnett GS, Mohseni M (2020) Self-supervised learning of generative spin-glasses with normalizing flows

  • Hopfield JJ (1982) Neural networks and physical systems with emergent collective computational abilities. Proceedings of the National Academy of Sciences 79(8):2554–2558

  • Houdayer J (2001) A cluster Monte Carlo algorithm for 2-dimensional spin glasses. The European Physical Journal B-Condensed Matter and Complex Systems 22(4):479–484

  • Hsieh CY, Sun Q, Zhang S, Lee CK (2021) Unitary-coupled restricted Boltzmann machine ansatz for quantum simulations. npj Quantum Information 7(1):1–10

  • Haarnoja T, Tang H, Abbeel P, Levine S (2017) Reinforcement learning with deep energy-based policies. In: Proceedings of the 34th International Conference on Machine Learning, vol. 70. pp 1352–1361, JMLR. org

  • Iten R, Metger T, Wilming H, Del Rio L, Renner R (2018) Discovering physical concepts with neural networks. arXiv:1807.10300

  • Jaynes ET (1957) Information theory and statistical mechanics. Physical Review 106(4):620

  • Kirkpatrick S, Gelatt CD, Vecchi MP (1983) Optimization by simulated annealing. Science 220(4598):671–680

  • Khoshaman A, Vinci W, Denis B, Andriyash E, Sadeghi H, Amin MH (2018) Quantum variational autoencoder. Quantum Science and Technology 4(1):014001

  • Kieferová M, Wiebe N (2017) Tomography and generative training with quantum Boltzmann machines. Physical Review A 96(6):062327

  • LeCun Y, Chopra S, Hadsell R (2006) A tutorial on energy-based learning

  • Le Roux N, Bengio Y (2008) Representational power of restricted Boltzmann machines and deep belief networks. Neural Computation 20(6):1631–1649

  • Le Roux N, Bengio Y (2010) Deep belief networks are compact universal approximators. Neural Computation 22(8):2192–2207

  • Liu J-G, Wang L (2018) Differentiable learning of quantum circuit born machines. Physical Review A 98(6):062324

  • Li X, Wang B, Liu Y, Lee TS (2013) Learning discriminative sufficient statistics score space for classification. In: Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, pp 49–64

  • Melko RG, Carleo G, Carrasquilla J, Cirac JI (2019) Restricted Boltzmann machines in quantum physics. Nature Physics 15(9):887–892

  • Mezard M, Montanari A (2009) Information, physics, and computation. Oxford University Press Inc, New York

  • Moore C, Mertens S (2011) The nature of computation. Oxford University Press Inc, New York

  • Melnikov AA, Nautrup HP, Krenn M, Dunjko V, Tiersch M, Zeilinger A, Briegel HJ (2018) Active learning machine learns to create new quantum experiments. Proceedings of the National Academy of Sciences 115(6):1221–1226

  • Mohseni M (2021) Article in preparation

  • Neyshabur B, Bhojanapalli S, McAllester D, Srebro N (2017) Exploring generalization in deep learning. In: Advances in neural information processing systems. pp 5947–5956

  • Nielsen MA, Chuang I (2002) Quantum computation and quantum information

  • Nijkamp E, Hill M, Han T, Zhu S-C, Wu YN (2019) On the anatomy of MCMC-based maximum likelihood learning of energy-based models

  • Robert CP, Casella G (1999) The Metropolis-Hastings algorithm. In: Monte Carlo statistical methods. Springer, pp 231–283

  • Rojas R (1996) Neural networks: a systematic introduction. Springer-Verlag, Berlin, Heidelberg

  • Rojas R (2013) Neural networks: a systematic introduction. Springer Science & Business Media, New York

  • Swersky K, Buchman D, Freitas ND, Marlin BM, et al (2011) On autoencoders and score matching for energy based models. In: Proceedings of the 28th International Conference on Machine Learning (ICML-11). pp 1201–1208

  • Selby A (2014) Efficient subgraph-based sampling of Ising-type models with frustration. arXiv:1409.3934

  • Sutton B, Faria R, Ghantasala LA, Jaiswal R, Camsari KY, Datta S (2020) Autonomous probabilistic coprocessing with petaflips per second. IEEE Access 8:157238–157252

  • Salakhutdinov R, Mnih A, Hinton G (2007) Restricted Boltzmann machines for collaborative filtering. In: Proceedings of the 24th International Conference on Machine Learning. ACM, pp 791–798

  • Stein DL, Newman CM (2013) Spin glasses and complexity. Princeton University Press, Princeton

  • Swendsen RH, Wang J-S (1987) Nonuniversal critical dynamics in Monte Carlo simulations. Phys Rev Lett 58:86–88

  • Torlai G, Mazzola G, Carrasquilla J, Troyer M, Melko R, Carleo G (2018) Neural-network quantum state tomography. Nature Physics 14(5):447

  • van Hemmen JL (1986) Spin-glass models of a neural network. Physical Review A 34(4):3435–3445

  • Verdon G, Marks J, Nanda S, Leichenauer S, Hidary J (2019) Quantum Hamiltonian-based models and the variational quantum thermalizer algorithm. arXiv:1910.02071

  • Wetzel SJ (2017) Unsupervised learning of phase transitions: from principal component analysis to variational autoencoders. Physical Review E 96(2):022140

  • Wolff U (1989) Collective Monte Carlo updating for spin systems. Phys Rev Lett 62:361–364

  • Zhai S, Cheng Y, Lu W, Zhang Z (2016) Deep structured energy based models for anomaly detection. arXiv:1605.07717

  • Zhao J, Mathieu M, LeCun Y (2016) Energy-based generative adversarial network. arXiv:1609.03126

  • Zhang J, Wang H, Chu J, Huang S, Li T, Zhao Q (2019) Improved Gaussian-Bernoulli restricted Boltzmann machine for learning discriminative representations. Knowledge-Based Systems 185:104911


Author information

Corresponding author

Correspondence to Patrick Huembeli.

About this article


Cite this article

Huembeli, P., Arrazola, J.M., Killoran, N. et al. The physics of energy-based models. Quantum Mach. Intell. 4, 1 (2022). https://doi.org/10.1007/s42484-021-00057-7
