A Practical Guide to Training Restricted Boltzmann Machines

  • Geoffrey E. Hinton
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7700)


Restricted Boltzmann machines (RBMs) have been used as generative models of many different types of data. RBMs are usually trained using the contrastive divergence learning procedure. Deciding how to set the values of its numerical meta-parameters requires a certain amount of practical experience. Over the last few years, the machine learning group at the University of Toronto has acquired considerable expertise at training RBMs, and this guide is an attempt to share this expertise with other machine learning researchers.


Keywords: Learning Rate, Reconstruction Error, Hidden Unit, Restricted Boltzmann Machine, Training Case





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Geoffrey E. Hinton, Department of Computer Science, University of Toronto, Toronto, Canada
