Optimizing restricted Boltzmann machine learning by injecting Gaussian noise to likelihood gradient approximation

  • Prima Sanjaya
  • Dae-Ki Kang

Abstract

Restricted Boltzmann machines (RBMs) can be trained by maximum likelihood learning, that is, by applying stochastic gradient ascent to the likelihood objective. However, this is difficult because the gradient of the marginalization function is intractable. Several methods approximate this intractable term with a Gibbs Markov chain, including Contrastive Divergence, Persistent Contrastive Divergence, and Fast Contrastive Divergence. In this paper, we propose an optimization that injects noise into the underlying Monte Carlo estimation. We introduce two novel learning algorithms: Noisy Persistent Contrastive Divergence (NPCD) and Fast Noisy Persistent Contrastive Divergence (FNPCD). We prove that, provided a satisfactory condition holds, the NPCD and FNPCD algorithms on average benefit convergence to the equilibrium state. In an empirical comparison of diverse CD-based approaches, we found that our proposed methods frequently obtain higher classification performance than traditional approaches on standard image classification benchmarks such as the MNIST, basic, and rotation datasets.
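
The abstract does not say where the noise enters the update, so the following is only a minimal sketch of the general idea, not the authors' NPCD algorithm. It shows one Persistent Contrastive Divergence step for a binary RBM in which zero-mean Gaussian noise is added to the persistent chain state before the Gibbs step. The injection point, the function name `noisy_pcd_step`, and the parameters `lr` and `noise_std` are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def noisy_pcd_step(W, b, c, v_data, v_chain, rng, lr=0.05, noise_std=0.1):
    """One PCD update for a binary RBM with Gaussian noise on the chain.

    W: (n_visible, n_hidden) weights; b, c: visible and hidden biases.
    v_data: batch of training vectors; v_chain: persistent chain states.
    The noise placement is a hypothetical choice, not taken from the paper.
    """
    # Positive phase: hidden probabilities given the data.
    h_data = sigmoid(v_data @ W + c)

    # ASSUMPTION: perturb the persistent visible state with N(0, noise_std^2).
    v_noisy = v_chain + noise_std * rng.standard_normal(v_chain.shape)

    # Negative phase: one Gibbs step starting from the perturbed chain state.
    h_prob = sigmoid(v_noisy @ W + c)
    h_samp = (rng.random(h_prob.shape) < h_prob).astype(float)
    v_prob = sigmoid(h_samp @ W.T + b)
    v_new = (rng.random(v_prob.shape) < v_prob).astype(float)
    h_model = sigmoid(v_new @ W + c)

    # Approximate likelihood gradient: data statistics minus chain statistics.
    dW = v_data.T @ h_data / v_data.shape[0] - v_new.T @ h_model / v_new.shape[0]
    db = v_data.mean(axis=0) - v_new.mean(axis=0)
    dc = h_data.mean(axis=0) - h_model.mean(axis=0)

    # Stochastic gradient ascent step; the chain state persists across calls.
    return W + lr * dW, b + lr * db, c + lr * dc, v_new


# Toy usage on random binary data (illustrative sizes only).
rng = np.random.default_rng(0)
n_visible, n_hidden, batch = 6, 4, 8
W = 0.01 * rng.standard_normal((n_visible, n_hidden))
b = np.zeros(n_visible)
c = np.zeros(n_hidden)
v_data = (rng.random((batch, n_visible)) < 0.5).astype(float)
v_chain = (rng.random((batch, n_visible)) < 0.5).astype(float)
for _ in range(10):
    W, b, c, v_chain = noisy_pcd_step(W, b, c, v_data, v_chain, rng)
```

Setting `noise_std=0.0` reduces the step to plain Persistent Contrastive Divergence, which makes it easy to compare the two variants on the same data.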

Keywords

  • Restricted Boltzmann machine
  • Deep belief network
  • Optimization
  • Regularization
  • Markov Chain Monte Carlo

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Department of Research and Development, Medical Ip. 806-809, Cancer Research Institute, Seoul National University Hospital, Seoul, Republic of Korea
  2. Department of Computer Engineering, Dongseo University, Busan, Republic of Korea