MPtostream: an OpenMP compiler for CPU-GPU heterogeneous parallel systems

Yang, XueJun; Tang, Tao; Wang, GuiBin; Jia, Jia; Xu, XinHai

doi:10.1007/s11432-011-4342-4

Research Paper
Published: 30 July 2011

Volume 55, pages 1961–1971 (2012)
Science China Information Sciences

Abstract

Given the strong floating-point performance of GPUs, heterogeneous parallel systems that combine general-purpose CPUs with GPUs have become a focus of research in high performance computing (HPC). However, because GPU programming is complex, porting the large body of existing scientific computing applications to such systems remains a major challenge. The OpenMP programming interface is widely adopted on multi-core CPUs in scientific computing. To reuse existing OpenMP applications and reduce the porting cost, we extend OpenMP with a group of compiler directives that explicitly divide tasks between the CPU and the GPU and map time-consuming computing fragments to the GPU, which dramatically simplifies porting. We have designed and implemented MPtoStream, a compiler for the extended OpenMP that targets AMD's stream processing GPUs. Our experimental results show that programs written with the extended directives require less than 11% modification relative to their OpenMP versions and achieve speedups ranging from 3.1 to 17.3 on a heterogeneous system with an Intel Xeon E5405 CPU and an AMD FireStream 9250 GPU, relative to execution on the Xeon CPU alone.
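
As a rough illustration of the starting point the abstract describes, the sketch below shows a plain OpenMP loop of the kind that might be a candidate for mapping to the GPU. It uses only the standard `#pragma omp parallel for` directive. The function name `saxpy` and its parameters are assumptions chosen for this example; the extended directives themselves do not appear in this text, so none are reproduced or guessed at here.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative only: computes y[i] = a * x[i] + y[i] for i in [0, n).
 * The iterations are independent of one another, which is what makes
 * a loop like this a plausible "time-consuming computing fragment".
 * The name saxpy and this signature are assumptions for the example,
 * not taken from the paper. */
void saxpy(size_t n, float a, const float *x, float *y)
{
    assert(x != NULL && y != NULL);

    /* Standard OpenMP work-sharing directive. If the compiler is run
     * without OpenMP support, the pragma is ignored and the loop
     * executes serially with the same result. */
    #pragma omp parallel for
    for (long i = 0; i < (long)n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

With GCC or Clang, OpenMP is typically enabled by passing `-fopenmp`; the function gives the same result either way.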

Author information

Authors and Affiliations

National Laboratory for Parallel and Distributed Processing, National University of Defense Technology, Changsha, 410073, China
XueJun Yang, Tao Tang, GuiBin Wang, Jia Jia & XinHai Xu

Corresponding author

Correspondence to Tao Tang.

About this article

Cite this article

Yang, X., Tang, T., Wang, G. et al. MPtostream: an OpenMP compiler for CPU-GPU heterogeneous parallel systems. Sci. China Inf. Sci. 55, 1961–1971 (2012). https://doi.org/10.1007/s11432-011-4342-4

Received: 04 June 2009
Accepted: 13 August 2010
Published: 30 July 2011
Issue Date: September 2012
DOI: https://doi.org/10.1007/s11432-011-4342-4
