ICANN 2011: Artificial Neural Networks and Machine Learning – ICANN 2011, pp. 10-17
Improved Learning of Gaussian-Bernoulli Restricted Boltzmann Machines
Abstract
We propose three remedies for the training of Gaussian-Bernoulli restricted Boltzmann machines (GBRBMs), which is known to be difficult. First, we use a different parameterization of the energy function, which allows a more intuitive interpretation of the parameters and facilitates learning. Second, we propose parallel tempering learning for GBRBMs. Third, we use an adaptive learning rate that is selected automatically in order to stabilize training. Our extensive experiments show that the proposed improvements remove most of the difficulties encountered when training GBRBMs with conventional methods.
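The abstract does not state the energy function itself. As a rough illustration only, the sketch below assumes one commonly used GBRBM form in which each visible unit has its own variance `sigma2[i]` and the interaction term is scaled by that variance. The function names, argument layout, and the specific energy are assumptions made for this example, not text taken from the paper.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def energy(v, h, W, b, c, sigma2):
    """Assumed energy (not quoted from the paper):
    E(v, h) = sum_i (v_i - b_i)^2 / (2 sigma_i^2)
              - sum_ij (v_i / sigma_i^2) W_ij h_j
              - sum_j c_j h_j
    """
    quad = sum((vi - bi) ** 2 / (2.0 * s) for vi, bi, s in zip(v, b, sigma2))
    inter = sum(v[i] / sigma2[i] * W[i][j] * h[j]
                for i in range(len(v)) for j in range(len(h)))
    bias = sum(cj * hj for cj, hj in zip(c, h))
    return quad - inter - bias

def hidden_prob(v, W, c, sigma2):
    # p(h_j = 1 | v) = sigmoid(c_j + sum_i W_ij v_i / sigma_i^2)
    return [sigmoid(c[j] + sum(v[i] / sigma2[i] * W[i][j] for i in range(len(v))))
            for j in range(len(c))]

def visible_mean(h, W, b):
    # Under the assumed energy, p(v_i | h) is Gaussian with
    # mean b_i + sum_j W_ij h_j and variance sigma_i^2.
    return [b[i] + sum(W[i][j] * h[j] for j in range(len(h)))
            for i in range(len(b))]
```

Under this assumed form the conditional mean of a visible unit does not depend on its variance, which is one way a parameterization can make the weights and biases easier to interpret.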
Keywords
Restricted Boltzmann Machine; Gaussian-Bernoulli Restricted Boltzmann Machine; Adaptive Learning Rate; Parallel Tempering