Hybrid Programming Using OpenSHMEM and OpenACC

  • Matthew Baker
  • Swaroop Pophale
  • Jean-Charles Vasnier
  • Haoqiang Jin
  • Oscar Hernandez
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8356)


With high-performance systems exploiting multicore and accelerator-based architectures on a distributed shared memory system, heterogeneous hybrid programming models are the natural choice to exploit all the hardware made available on these systems. Previous efforts looking into hybrid models have primarily focused on combining OpenMP directives (for shared-memory programming) with MPI (for inter-node programming on a cluster): OpenMP spawns threads on a node, and a communication library such as MPI communicates across nodes. As accelerators are added into the mix and hardware support for PGAS languages/APIs improves, new and unexplored heterogeneous hybrid models will be needed to effectively leverage the new hardware. In this paper we explore the use of OpenACC directives to program GPUs and the use of OpenSHMEM, a PGAS library for one-sided communication between nodes. We use the NAS-BT Multi-zone benchmark, which was converted to use the OpenSHMEM library API for network communication between nodes and OpenACC to exploit the accelerators present within a node. We evaluate the performance of the benchmark and discuss our experiences during the development of the OpenSHMEM+OpenACC hybrid program.





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Matthew Baker (1)
  • Swaroop Pophale (3)
  • Jean-Charles Vasnier (4)
  • Haoqiang Jin (2)
  • Oscar Hernandez (1)
  1. Oak Ridge National Laboratory, Oak Ridge, USA
  2. NASA Ames, USA
  3. University of Houston, Houston, USA
  4. CAPS Entreprise, France
