Adaptation of Deep Belief Networks to Modern Multicore Architectures

  • Tomasz Olas
  • Wojciech K. Mleczko
  • Robert K. Nowicki
  • Roman Wyrzykowski
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9573)

Abstract

In our previous paper [17], the parallel realization of Restricted Boltzmann Machines (RBMs) was discussed. That research confirmed the potential usefulness of the Intel MIC parallel architecture for the implementation of RBMs.

In this work, we investigate how the Intel MIC and Intel CPU architectures can be applied to implement the complete learning process for Deep Belief Networks (DBNs), whose layers correspond to RBMs. The learning procedure is based on the matrix approach, in which learning samples are grouped into packages and represented as matrices. This approach is now applied to both the initial learning and fine-tuning stages. The influence of the package size on the accuracy of learning, as well as on the performance of computations, is studied using conventional CPU and Intel Xeon Phi architectures.
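To illustrate the matrix approach described above, the sketch below shows a single CD-1 (contrastive divergence) update for one RBM layer, with a package of learning samples stored as the rows of a matrix so that all per-sample operations collapse into a few matrix products. This is only a minimal NumPy illustration of the general technique, not the paper's implementation; the function name `cd1_package_update`, the learning rate, and the toy sizes are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_package_update(W, b, c, V, lr=0.1):
    """One CD-1 step on a package (mini-batch) of samples.

    V : (package_size, n_visible) matrix -- each row is one sample,
        so the whole package is processed with matrix products.
    W : (n_visible, n_hidden) weights; b, c : visible/hidden biases.
    """
    # Positive phase: hidden probabilities for the whole package at once.
    H = sigmoid(V @ W + c)
    # Sample hidden states, reconstruct visibles, recompute hiddens.
    Hs = (rng.random(H.shape) < H).astype(V.dtype)
    V1 = sigmoid(Hs @ W.T + b)
    H1 = sigmoid(V1 @ W + c)
    # Gradient estimates are averaged over the package (one GEMM each),
    # which is where larger packages improve computational efficiency.
    n = V.shape[0]
    W += lr * (V.T @ H - V1.T @ H1) / n
    b += lr * (V - V1).mean(axis=0)
    c += lr * (H - H1).mean(axis=0)
    return W, b, c

# Toy run: a package of 8 binary samples, 6 visible / 4 hidden units.
W = 0.01 * rng.standard_normal((6, 4))
b, c = np.zeros(6), np.zeros(4)
V = (rng.random((8, 6)) < 0.5).astype(float)
W, b, c = cd1_package_update(W, b, c, V)
```

Because the positive and negative phases reduce to dense matrix multiplications, the package size directly controls the shapes of these GEMMs, which is why it affects both learning accuracy and hardware utilization on CPU and Xeon Phi.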

Keywords

Deep belief network · Restricted Boltzmann machine · Parallel programming · Multicore architectures · OpenMP · Vectorization · Intel Xeon Phi

Notes

Acknowledgements

This project was supported by the National Centre for Research and Development under MICLAB project No. POIG.02.03.00.24-093/13, by the Polish Ministry of Science and Higher Education under Grant No. BS-1-112-304/99/S, as well as by the Polish National Science Centre under grant No. DEC-2012/05/B/ST6/03620.

The authors are grateful to the Czestochowa University of Technology for granting access to the Intel CPU and Xeon Phi platforms provided by the MICLAB project.

References

  1. Bilski, J., Litwiński, S., Smoląg, J.: Parallel realisation of QR algorithm for neural networks learning. In: Rutkowski, L., Siekmann, J.H., Tadeusiewicz, R., Zadeh, L.A. (eds.) ICAISC 2004. LNCS (LNAI), vol. 3070, pp. 158–165. Springer, Heidelberg (2004)
  2. Bilski, J., Smoląg, J.: Parallel realisation of the recurrent RTRN neural network learning. In: Rutkowski, L., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds.) ICAISC 2008. LNCS (LNAI), vol. 5097, pp. 11–16. Springer, Heidelberg (2008)
  3. Bilski, J., Smoląg, J.: Parallel realisation of the recurrent Elman neural network learning. In: Rutkowski, L., Scherer, R., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds.) ICAISC 2010, Part II. LNCS, vol. 6114, pp. 19–25. Springer, Heidelberg (2010)
  4. Bilski, J., Smoląg, J.: Parallel realisation of the recurrent multi layer perceptron learning. In: Rutkowski, L., Korytkowski, M., Scherer, R., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds.) ICAISC 2012, Part I. LNCS, vol. 7267, pp. 12–20. Springer, Heidelberg (2012)
  5. Bilski, J., Smoląg, J.: Parallel approach to learning of the recurrent Jordan neural network. In: Rutkowski, L., Korytkowski, M., Scherer, R., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds.) ICAISC 2013, Part I. LNCS, vol. 7894, pp. 32–40. Springer, Heidelberg (2013)
  6. Bilski, J., Smoląg, J., Galushkin, A.I.: The parallel approach to the conjugate gradient learning algorithm for the feedforward neural networks. In: Rutkowski, L., Korytkowski, M., Scherer, R., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds.) ICAISC 2014, Part I. LNCS, vol. 8467, pp. 12–21. Springer, Heidelberg (2014)
  7. Chu, J.L., Krzyzak, A.: The recognition of partially occluded objects with support vector machines and convolutional neural networks and deep belief networks. J. Artif. Intell. Soft Comput. Res. 4(1), 5–19 (2014)
  8. Intel Corporation: Intel Xeon Phi Coprocessor System Software Developer's Guide. Technical report, June 2013
  9. Dourlens, S., Ramdane-Cherif, A.: Modeling & understanding environment using semantic agents. J. Artif. Intell. Soft Comput. Res. 1(4), 301–314 (2011)
  10. Fang, J., Varbanescu, A.L., Sips, H.: Benchmarking Intel Xeon Phi to guide kernel design. Delft University of Technology Parallel and Distributed Systems Report Series, No. PDS-2013-005, pp. 1–22 (2013)
  11. Hinton, G.: Training products of experts by minimizing contrastive divergence. Neural Comput. 14(8), 1771–1800 (2002)
  12. Hinton, G.: A practical guide to training restricted Boltzmann machines. Momentum 9(1), 926 (2010)
  13. Hinton, G., Osindero, S., Teh, Y.W.: A fast learning algorithm for deep belief nets. Neural Comput. 18(7), 1527–1554 (2006)
  14. MICLAB: Pilot laboratory of massively parallel systems. http://miclab.pl (2015)
  15. The MNIST database of handwritten digits. http://yann.lecun.com/exdb/mnist/
  16. Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1725–1732, June 2014
  17. Olas, T., Mleczko, W.K., Nowicki, R.K., Wyrzykowski, R., Krzyzak, A.: Adaptation of RBM learning for Intel MIC architecture. In: Artificial Intelligence and Soft Computing: Proceedings of the 14th International Conference, ICAISC 2015, Zakopane, Poland, 14–18 June 2015, Part I, pp. 90–101. Springer International Publishing, Cham (2015)
  18. Patan, K., Patan, M.: Optimal training strategies for locally recurrent neural networks. J. Artif. Intell. Soft Comput. Res. 1(2), 103–114 (2011)
  19. Reinders, J.: An Overview of Programming for Intel Xeon Processors and Intel Xeon Phi Coprocessors. Technical report (2012)
  20. Rojek, K.A., Ciznicki, M., Rosa, B., Kopta, P., Kulczewski, M., Kurowski, K., Piotrowski, Z.P., Szustak, L., Wojcik, D.K., Wyrzykowski, R.: Adaptation of fluid model EULAG to graphics processing unit architecture. Concurr. Comput.: Pract. Exper. 27(4), 937–957 (2015)
  21. Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., et al.: ImageNet large scale visual recognition challenge. arXiv preprint (2014). arXiv:1409.0575
  22. Saule, E., Kaya, K., Catalyurek, U.: Performance evaluation of sparse matrix multiplication kernels on Intel Xeon Phi. In: Wyrzykowski, R., Dongarra, J., Karczewski, K., Waśniewski, J. (eds.) Parallel Processing and Applied Mathematics. LNCS, vol. 8384, pp. 559–570. Springer, Heidelberg (2014)
  23. Smolensky, P.: Information processing in dynamical systems: foundations of harmony theory. In: Rumelhart, D.E., McClelland, J.L. (eds.) Parallel Distributed Processing: Explorations in the Microstructure of Cognition, vol. 1: Foundations, chapter 6, pp. 194–281. MIT Press (1986)
  24. Smolensky, P.: Information processing in dynamical systems: foundations of harmony theory (1986)
  25. Staff, C.I., Reinders, J.: Parallel Programming and Optimization with Intel® Xeon Phi™ Coprocessors: Handbook on the Development and Optimization of Parallel Applications for Intel® Xeon Processors and Intel® Xeon Phi™ Coprocessors. Colfax International (2013)
  26. Szustak, L., Rojek, K., Gepner, P.: Using Intel Xeon Phi coprocessor to accelerate computations in MPDATA algorithm. In: Wyrzykowski, R., Dongarra, J., Karczewski, K., Waśniewski, J. (eds.) PPAM 2013, Part I. LNCS, vol. 8384, pp. 582–592. Springer, Heidelberg (2014)
  27. Szustak, L., Rojek, K., Olas, T., Kuczynski, L., Halbiniak, K., Gepner, P.: Adaptation of MPDATA heterogeneous stencil computation to Intel Xeon Phi coprocessor. Sci. Program. 2015, 14 (2015)
  28. Tambouratzis, T., Chernikova, D., Pázsit, I.: Pulse shape discrimination of neutrons and gamma rays using Kohonen artificial neural networks. J. Artif. Intell. Soft Comput. Res. 3(2), 77–88 (2013)
  29. Wyrzykowski, R., Szustak, L., Rojek, K.: Parallelization of 2D MPDATA EULAG algorithm on hybrid architectures with GPU accelerators. Parallel Comput. 40(8), 425–447 (2014)

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Tomasz Olas (1)
  • Wojciech K. Mleczko (2)
  • Robert K. Nowicki (2)
  • Roman Wyrzykowski (1)
  1. Institute of Computer and Information Sciences, Czestochowa University of Technology, Czestochowa, Poland
  2. Institute of Computational Intelligence, Czestochowa University of Technology, Czestochowa, Poland