
Chainer-XP: A Flexible Framework for ANNs Run on the Intel® Xeon Phi™ Coprocessor

  • Conference paper
  • In: Modeling, Simulation and Optimization of Complex Processes HPSC 2018

Abstract

Chainer is a well-known deep learning framework that facilitates the quick and efficient construction of Artificial Neural Networks. Chainer can be deployed efficiently on systems consisting of Central Processing Units and Graphics Processing Units, and it can also run on systems containing Intel Xeon Phi coprocessors. However, Chainer only supports the Intel Xeon Phi Knights Landing generation, not Knights Corner. Many existing systems, such as Tianhe-2 (MilkyWay-2), Thunder, Cascade, and SuperMUC, contain Knights Corner coprocessors only. As a result, Chainer cannot fully exploit the computing power of such systems, which creates a demand for enabling Chainer to run on them. The task is all the more challenging because deep learning applications are written in Python, whereas the Xeon Phi coprocessor only executes native code compiled from C/C++ or Fortran. Fortunately, there is an offloading module called pyMIC that helps port Python applications onto the Intel Xeon Phi Knights Corner coprocessor. In this paper, we present Chainer-XP, a deep learning framework that enables applications to run on systems containing the Intel Xeon Phi Knights Corner coprocessor. Chainer-XP is an extension of Chainer that integrates pyMIC into it. The experimental findings show that Chainer-XP can move the core computation (matrix multiplication) to the Intel Xeon Phi Knights Corner coprocessor with acceptable performance in comparison with Chainer.
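To make the offloading step concrete, the following is a minimal sketch of moving a matrix multiplication onto a Knights Corner coprocessor with pyMIC, in the spirit of the approach the abstract describes. The host-side calls (pymic.devices, get_default_stream, bind, invoke, update_host, sync) are pyMIC's documented API; the native library libdgemm.so and its kernel dgemm_kernel are hypothetical names standing in for a C/Fortran GEMM routine compiled for the device. This is an illustration of the mechanism under those assumptions, not the paper's implementation.

    # Minimal pyMIC offload sketch: multiply two matrices on an Intel Xeon Phi
    # Knights Corner coprocessor. "libdgemm.so" and "dgemm_kernel" are
    # hypothetical names for a native GEMM kernel compiled for the device.
    import numpy as np
    import pymic

    m, k, n = 1024, 1024, 1024
    alpha, beta = 1.0, 0.0
    a = np.random.random((m, k))
    b = np.random.random((k, n))
    c = np.zeros((m, n))

    device = pymic.devices[0]                     # first coprocessor in the system
    stream = device.get_default_stream()          # command stream for offload operations
    library = device.load_library("libdgemm.so")  # native kernel library (hypothetical)

    # Allocate buffers on the coprocessor and copy the host arrays over.
    offl_a = stream.bind(a)
    offl_b = stream.bind(b)
    offl_c = stream.bind(c)

    # Launch the native kernel on the device; scalar arguments are
    # marshalled automatically by pyMIC.
    stream.invoke(library.dgemm_kernel, offl_a, offl_b, offl_c, m, n, k, alpha, beta)

    offl_c.update_host()  # copy the result back into the host array c
    stream.sync()         # wait until all queued operations have completed

Chainer-XP applies this same pattern inside Chainer's numerical layer, so that the dominant matrix multiplications execute on the coprocessor while the rest of the framework remains in Python.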



Acknowledgements

This research was conducted within the project “Studying Tools to Support Applications Running on Powerful Clusters & Big Data Analytics (HPDA phase I 2018–2020)”, funded by the Ho Chi Minh City Department of Science and Technology under grant number 46/2018/HD-QKHCN. The authors would also like to thank the anonymous referees for their valuable comments and helpful suggestions.

Author information

Corresponding author

Correspondence to Thanh-Dang Diep.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Diep, TD. et al. (2021). Chainer-XP: A Flexible Framework for ANNs Run on the Intel® Xeon Phi™ Coprocessor. In: Bock, H.G., Jäger, W., Kostina, E., Phu, H.X. (eds) Modeling, Simulation and Optimization of Complex Processes HPSC 2018. Springer, Cham. https://doi.org/10.1007/978-3-030-55240-4_7

