On CPU Performance Optimization of Restricted Boltzmann Machine and Convolutional RBM

  • Baptiste Wicht
  • Andreas Fischer
  • Jean Hennebert
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9896)


Although Graphics Processing Units (GPUs) currently seem to be the best platform for training machine learning models, most research laboratories are still equipped only with standard CPU systems. In this paper, we investigate several techniques to speed up the training of Restricted Boltzmann Machine (RBM) and Convolutional RBM (CRBM) models on the CPU with the Contrastive Divergence (CD) algorithm. Experimentally, we show that the proposed techniques reduce training time by up to 30 times for the RBM and up to 12 times for the CRBM on a data set of handwritten digits.
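For context, the CD-1 update that the paper optimizes can be sketched as follows. This is a generic, minimal NumPy implementation of one contrastive-divergence step for a binary RBM, not the authors' optimized code; all function and variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, b, c, lr=0.1):
    """One CD-1 step for a binary RBM, updating W, b, c in place.

    v0 : (batch, n_visible) binary data batch
    W  : (n_visible, n_hidden) weight matrix
    b  : (n_visible,) visible biases
    c  : (n_hidden,) hidden biases
    Returns the mean squared reconstruction error of the batch.
    """
    # Positive phase: hidden probabilities and a sampled hidden state
    h0_prob = sigmoid(v0 @ W + c)
    h0 = (rng.random(h0_prob.shape) < h0_prob).astype(v0.dtype)
    # Negative phase: one Gibbs step down to the visibles and back up
    v1_prob = sigmoid(h0 @ W.T + b)
    h1_prob = sigmoid(v1_prob @ W + c)
    # Gradient approximation: <v h>_data - <v h>_model
    batch = v0.shape[0]
    W += lr * (v0.T @ h0_prob - v1_prob.T @ h1_prob) / batch
    b += lr * (v0 - v1_prob).mean(axis=0)
    c += lr * (h0_prob - h1_prob).mean(axis=0)
    return float(((v0 - v1_prob) ** 2).mean())
```

The matrix products in the positive and negative phases dominate the cost of each step, which is why the paper's CPU optimizations target exactly these operations.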





Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Baptiste Wicht (1, 2)
  • Andreas Fischer (1, 2)
  • Jean Hennebert (1, 2)
  1. University of Applied Science of Western Switzerland, Delémont, Switzerland
  2. University of Fribourg, Fribourg, Switzerland
