
Opencl-pytorch: an OpenCL-based extension of PyTorch

  • Regular Paper
  • Published:
CCF Transactions on High Performance Computing

Abstract

Currently, most Deep Learning (DL) frameworks support only the CUDA and ROCm environments, limiting their use to NVIDIA and AMD GPUs. Because modern High-Performance Computing (HPC) systems typically employ several kinds of heterogeneous devices to accelerate computation, many HPC systems cannot use their heterogeneous devices with these DL frameworks. To address this problem, we introduce OpenCL-PyTorch, a PyTorch extension based on OpenCL. This extension enables the deployment of DL models on a broader range of OpenCL devices, encompassing CPUs, GPUs, and other accelerators. A standout feature of OpenCL-PyTorch is our novel unified OpenCL device and memory management approach, which significantly enhances performance. We rigorously evaluated OpenCL-PyTorch with various DL models, confirming its accuracy and effectiveness. The validation of the management approach further underscores the importance of unified device and memory management in optimizing operator performance.
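The abstract's unified device and memory management idea can be illustrated with a minimal sketch: a single registry that tracks every OpenCL-style device and recycles freed buffers from per-device pools rather than reallocating them. All names here (`DeviceManager`, `MemoryPool`, the `opencl:0` device strings) are hypothetical illustrations, not the paper's actual API, and a Python `bytearray` stands in for a real OpenCL buffer (`clCreateBuffer`).

```python
# Hypothetical sketch of unified device/memory management, inspired by the
# abstract. Not OpenCL-PyTorch's real implementation.
from collections import defaultdict


class MemoryPool:
    """Per-device pool that recycles released buffers instead of reallocating."""

    def __init__(self):
        self._free = defaultdict(list)  # buffer size -> reusable buffers
        self.allocations = 0            # raw allocations actually performed

    def allocate(self, size):
        if self._free[size]:
            return self._free[size].pop()  # reuse a cached buffer
        self.allocations += 1
        return bytearray(size)             # stand-in for clCreateBuffer

    def release(self, buf):
        self._free[len(buf)].append(buf)   # return to pool; do not free


class DeviceManager:
    """Single registry for all devices and their memory pools."""

    def __init__(self, device_names):
        self._pools = {name: MemoryPool() for name in device_names}

    def allocate(self, device, size):
        return self._pools[device].allocate(size)

    def release(self, device, buf):
        self._pools[device].release(buf)

    def raw_allocations(self, device):
        return self._pools[device].allocations


if __name__ == "__main__":
    mgr = DeviceManager(["opencl:0", "opencl:1"])
    a = mgr.allocate("opencl:0", 1024)
    mgr.release("opencl:0", a)
    b = mgr.allocate("opencl:0", 1024)  # served from the pool, no new allocation
    print(mgr.raw_allocations("opencl:0"))  # 1
```

The point of routing every request through one manager is that operators never allocate device memory directly, so buffer reuse and cross-device bookkeeping happen in a single place.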




Data availability

This paper proposes an extension of PyTorch and the optimization solutions therein. Since no dataset was explicitly used in this work, no data needs to be made publicly available.

Notes

  1. PyTorch Examples: https://github.com/pytorch/examples.



Acknowledgements

This research is supported by the National Key R&D Program of China under grant 2021YFB0300104.

Author information


Corresponding author

Correspondence to Yufei Sun.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Sui, Y., Sun, Y., Shi, C. et al. Opencl-pytorch: an OpenCL-based extension of PyTorch. CCF Trans. HPC (2024). https://doi.org/10.1007/s42514-024-00186-y

