Research on Matrix Multiplication Based on the Combination of OpenACC and CUDA

Wang, Yuexing

doi:10.1007/978-981-13-7025-0_10

Yuexing Wang

Part of the book series: Communications in Computer and Information Science (CCIS, volume 980)

Included in the following conference series:

International Conference on Geo-Informatics in Sustainable Ecosystem and Society

Abstract

As the general-purpose computing capability of GPUs improves, parallel computing is increasingly used to solve difficult problems that involve large volumes of data and computationally intensive tasks. CUDA and OpenCL have been widely used and studied for general-purpose GPU computing. Both models, however, share a weakness: their APIs sit too close to the underlying hardware, which makes programming inefficient and ill-suited to large-scale parallel tasks that must be implemented quickly. OpenACC is a relatively high-level and simple programming model that enables rapid parallelization, but the resulting programs generally run slower than their CUDA equivalents. This paper therefore combines CUDA and OpenACC for mixed parallelization. The approach greatly reduces the workload of code conversion while delivering computing performance no lower than that of a pure CUDA program.
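
The OpenACC side of such a mixed approach can be illustrated with a minimal sketch. It is not the paper's code: the function name `matmul_acc`, the row-major square layout, and the choice of directives are assumptions made here for illustration. The sketch shows how a few directives annotate an ordinary triple loop so that a compiler can offload it, which is the low-effort path the abstract contrasts with hand-written CUDA.

```c
/* Hypothetical helper (not from the paper): C = A * B for
   row-major n x n single-precision matrices. */
void matmul_acc(const float *restrict a, const float *restrict b,
                float *restrict c, int n)
{
    /* Copy A and B to the device once and copy C back at the end,
       so no transfers happen inside the loop nest. */
    #pragma acc data copyin(a[0:n*n], b[0:n*n]) copyout(c[0:n*n])
    {
        /* Parallelize over all (i, j) output elements. */
        #pragma acc parallel loop collapse(2)
        for (int i = 0; i < n; ++i) {
            for (int j = 0; j < n; ++j) {
                float sum = 0.0f;
                /* Each output element accumulates its dot product
                   sequentially. */
                #pragma acc loop seq
                for (int k = 0; k < n; ++k) {
                    sum += a[i * n + k] * b[k * n + j];
                }
                c[i * n + j] = sum;
            }
        }
    }
}
```

A compiler without OpenACC support ignores the pragmas and runs the loops serially, so the same source serves as its own reference implementation; OpenACC is typically enabled with a flag such as `-acc` (NVIDIA HPC compilers) or `-fopenacc` (GCC). In a mixed program, one plausible arrangement is to keep the `data` region for memory management and replace only the hot loop nest with a hand-tuned CUDA kernel, passing device pointers through OpenACC's `host_data use_device` directive. Whether the paper does exactly this is not stated in the abstract.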

Author information

Authors and Affiliations

Hebei University of Engineering, Handan, 056000, Hebei, China
Yuexing Wang

Corresponding author

Correspondence to Yuexing Wang.

Editor information

Editors and Affiliations

Eastern Michigan University, Ypsilanti, MI, USA
Yichun Xie
Hebei University of Engineering, Handan, China
Anbing Zhang
Hebei University of Engineering, Handan, Hebei, China
Haixin Liu
Hebei University of Engineering, Handan, China
Lili Feng

About this paper

Cite this paper

Wang, Y. (2019). Research on Matrix Multiplication Based on the Combination of OpenACC and CUDA. In: Xie, Y., Zhang, A., Liu, H., Feng, L. (eds) Geo-informatics in Sustainable Ecosystem and Society. GSES 2018. Communications in Computer and Information Science, vol 980. Springer, Singapore. https://doi.org/10.1007/978-981-13-7025-0_10

DOI: https://doi.org/10.1007/978-981-13-7025-0_10
Published: 27 February 2019
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-7024-3
Online ISBN: 978-981-13-7025-0
eBook Packages: Computer Science, Computer Science (R0)