Euro-Par 2011: Parallel Processing Workshops pp 429-439
An Extension of XcalableMP PGAS Language for Multi-node GPU Clusters
Abstract
A GPU is a promising device for further increasing computing performance in the high-performance computing field. Besides CUDA, many programming languages have been proposed for offloading computation from the host to the GPU. However, parallel programming for a multi-node GPU cluster, where each node has one or more GPUs, is hard work. Users have to describe multi-level parallelism, both between nodes and within the GPU, using MPI and a GPGPU language like CUDA. In this paper, we propose a parallel programming language targeting multi-node GPU clusters. We extend XcalableMP, a parallel PGAS (Partitioned Global Address Space) programming language for PC clusters, to provide a productive parallel programming model for multi-node GPU clusters. Our performance evaluation with the N-body problem demonstrates that our model not only achieves scalable performance but also increases productivity, since it requires only small modifications to the serial code.