
Restricted Boltzmann Machine and Deep Belief Network

  • Chapter in Elements of Dimensionality Reduction and Manifold Learning

Abstract

More than a century ago, the Boltzmann distribution, also called the Gibbs distribution, was proposed. This energy-based distribution proved useful for the statistical modelling of physical systems. One such system was the Ising model, which describes interacting particles with binary spins. It was later discovered that the Ising model can be interpreted as a neural network; this insight led to the Hopfield network, which implements an Ising model as a network for modelling memory.


Notes

  1. A greedy algorithm makes each decision based on the greatest benefit at the current step, without looking ahead to the final outcome. The hope is that a sequence of locally optimal choices will yield a good overall result (a minimal sketch follows these notes).

  2. Simulated annealing is a metaheuristic optimization algorithm in which a temperature parameter controls the balance between global and local search. The temperature is gradually reduced, decreasing exploration and increasing exploitation of the search space (see the sketch after these notes).

  3. In hashing, a hash function maps data of arbitrary size to fixed-size values (a short example follows these notes).
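For Note 1, the following is a minimal sketch of a greedy strategy in Python, using the classic coin-change heuristic; the denominations and amount are hypothetical, chosen so that the locally optimal choice happens to be globally optimal.

    def greedy_coin_change(amount, denominations):
        """Repeatedly take the largest coin that fits (the locally best step)."""
        coins = []
        for coin in sorted(denominations, reverse=True):
            while amount >= coin:
                coins.append(coin)  # greedy choice: maximal immediate benefit
                amount -= coin
        return coins

    # Hypothetical example: 63 cents with canonical denominations
    print(greedy_coin_change(63, [1, 5, 10, 25]))  # [25, 25, 10, 1, 1, 1]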
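For Note 2, here is a minimal sketch of simulated annealing minimizing a toy one-dimensional objective. The objective, the Gaussian proposal, the geometric cooling schedule, and all parameter values are illustrative assumptions, not the chapter's algorithm; the acceptance rule, however, uses exactly the Boltzmann form discussed in this chapter.

    import math
    import random

    def simulated_annealing(f, x0, temp=10.0, cooling=0.99, steps=5000):
        """Minimize f starting from x0; the temperature controls exploration."""
        x, fx = x0, f(x0)
        for _ in range(steps):
            candidate = x + random.gauss(0.0, 1.0)  # local random move (assumed proposal)
            fc = f(candidate)
            # Always accept improvements; accept worse moves with
            # Boltzmann probability exp(-(fc - fx) / temp).
            if fc < fx or random.random() < math.exp(-(fc - fx) / temp):
                x, fx = candidate, fc
            temp *= cooling  # cool down: explore less, exploit more
        return x, fx

    # Toy objective with several local minima (hypothetical choice)
    x_best, f_best = simulated_annealing(lambda x: x * x + 10.0 * math.sin(x), x0=5.0)
    print(round(x_best, 3), round(f_best, 3))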
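For Note 3, a short example using Python's standard hashlib module: inputs of any length are mapped to a fixed-size 256-bit digest.

    import hashlib

    # Inputs of arbitrary size map to a fixed-size (256-bit) digest
    for message in [b"RBM", b"a much longer message about deep belief networks"]:
        digest = hashlib.sha256(message).hexdigest()
        print(len(digest), digest[:16])  # always 64 hex characters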



Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Ghojogh, B., Crowley, M., Karray, F., Ghodsi, A. (2023). Restricted Boltzmann Machine and Deep Belief Network. In: Elements of Dimensionality Reduction and Manifold Learning. Springer, Cham. https://doi.org/10.1007/978-3-031-10602-6_18

  • DOI: https://doi.org/10.1007/978-3-031-10602-6_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-10601-9

  • Online ISBN: 978-3-031-10602-6

  • eBook Packages: Computer Science, Computer Science (R0)
