Deep Boltzmann Machines and the Centering Trick

  • Chapter

Part of the Lecture Notes in Computer Science book series (LNTCS, volume 7700)

Abstract

Deep Boltzmann machines are, in theory, capable of learning efficient representations of seemingly complex data. In practice, however, designing an algorithm that learns such representations effectively raises several difficulties. In this chapter, we present the “centering trick”, which consists of rewriting the energy of the system as a function of centered states. The centering trick improves the conditioning of the underlying optimization problem and makes learning more stable, leading to models with better generative and discriminative properties.
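
The reparameterization described in the abstract can be illustrated on a single restricted Boltzmann machine layer (the chapter itself treats full deep Boltzmann machines). The sketch below is a minimal, hypothetical implementation assuming binary units, visible offsets set to the data mean, hidden offsets set to 0.5, and one step of contrastive divergence; all variable names and hyperparameters are illustrative, not taken from the chapter.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy binary data: 100 samples of 16 visible units
V = rng.integers(0, 2, size=(100, 16)).astype(float)

n_vis, n_hid = 16, 8
W = 0.01 * rng.standard_normal((n_vis, n_hid))
b = np.zeros(n_vis)  # visible bias
c = np.zeros(n_hid)  # hidden bias

# Centering offsets: data mean for visible units, 0.5 for hidden units.
# The energy becomes E(v, h) = -(v - beta)' W (h - mu) - b'v - c'h,
# so all weight interactions act on centered (roughly zero-mean) states.
beta = V.mean(axis=0)
mu = np.full(n_hid, 0.5)

lr = 0.05
for _ in range(10):
    # Positive phase: hidden activations given data, on centered visibles
    ph = sigmoid((V - beta) @ W + c)                  # p(h=1 | v)
    h = (rng.random(ph.shape) < ph).astype(float)
    # Negative phase: one Gibbs step (CD-1), on centered hiddens
    pv = sigmoid((h - mu) @ W.T + b)                  # p(v=1 | h)
    v_neg = (rng.random(pv.shape) < pv).astype(float)
    ph_neg = sigmoid((v_neg - beta) @ W + c)
    # Gradients are outer products of *centered* states
    gW = ((V - beta).T @ (ph - mu)
          - (v_neg - beta).T @ (ph_neg - mu)) / len(V)
    gb = (V - v_neg).mean(axis=0)
    gc = (ph - ph_neg).mean(axis=0)
    W += lr * gW
    b += lr * gb
    c += lr * gc
```

With both offsets fixed to zero, the updates reduce to the standard CD-1 rule, which is what makes the centered variant a drop-in modification of existing Boltzmann machine code.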

Keywords

  • Deep Boltzmann machine
  • centering
  • reparameterization
  • unsupervised learning
  • optimization
  • representations




Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Montavon, G., Müller, K.-R. (2012). Deep Boltzmann Machines and the Centering Trick. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Neural Networks: Tricks of the Trade. Lecture Notes in Computer Science, vol. 7700. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35289-8_33

  • DOI: https://doi.org/10.1007/978-3-642-35289-8_33

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-35288-1

  • Online ISBN: 978-3-642-35289-8

  • eBook Packages: Computer Science, Computer Science (R0)