Skip to main content

Part of the book series: Lecture Notes in Computer Science ((LNPSE,volume 8356))

Included in the following conference series:


With high performance systems exploiting multicore and accelerator-based architectures on a distributed shared memory system, heterogenous hybrid programming models are the natural choice to exploit all the hardware made available on these systems. Previous efforts looking into hybrid models have primarily focused on using OpenMP directives (for shared memory programming) with MPI (for inter-node programming on a cluster), using OpenMP to spawn threads on a node and communication libraries like MPI to communicate across nodes. As accelerators get added into the mix, and there is better hardware support for PGAS languages/APIs, this means that new and unexplored heterogenous hybrid models will be needed to effectively leverage the new hardware. In this paper we explore the use of OpenACC directives to program GPUs and the use of OpenSHMEM, a PGAS library for onesided communication between nodes. We use the NAS-BT Multi-zone benchmark that was converted to use the OpenSHMEM library API for network communication between nodes and OpenACC to exploit accelerators that are present within a node. We evaluate the performance of the benchmark and discuss our experiences during the development of the OpenSHMEM+OpenACC hybrid program.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions


Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

Similar content being viewed by others


  1. Top500: Top 500 supercomputer sites (2013),

  2. Bland, B.: Titan - early experience with the titan system at oak ridge national laboratory. In: Proceedings of the 2012 SC Companion: High Performance Computing, Networking Storage and Analysis, SCC 2012, pp. 2189–2211. IEEE Computer Society (2012)

    Google Scholar 

  3. Poole, S., Hernandez, O., Kuehn, J., Shipman, G., Curtis, A., Feind, K.: Openshmem - toward a unified rma model. In: Padua, D. (ed.) Encyclopedia of Parallel Computing, pp. 1379–1391. Springer US (2011)

    Google Scholar 

  4. Jin, H., der Wijngaart, R.F.V.: Performance characteristics of the multi-zone nas parallel benchmarks. In: IPDPS. IEEE Computer Society (2004)

    Google Scholar 

  5. OpenSHMEM Org.: Openshmem specification (2011)

    Google Scholar 

  6. Gokhale, M., Stone, J.: Napa c: compiling for a hybrid risc/fpga architecture. In: Proceedings of the IEEE Symposium on FPGAs for Custom Computing Machines, pp. 126–135 (1998)

    Google Scholar 

  7. Fraguela, B.B., Renau, J., Feautrier, P., Padua, D., Torrellas, J.: Programming the flexram parallel intelligent memory system. SIGPLAN Not. 38, 49–60 (2003)

    Article  Google Scholar 

  8. Bellens, P., Perez, J.M., Badia, R.M., Labarta, J.: Cellss: a programming model for the cell be architecture. In: Proceedings of the 2006 ACM/IEEE Conference on Supercomputing, SC 2006. ACM, New York (2006)

    Google Scholar 

  9. OpenHMPP: OpenHMPP: Concepts & Directives (2012)

    Google Scholar 

  10. Han, T.D., Abdelrahman, T.S.: hiCUDA: a high-level directive-based language for GPU programming. In: Proceedings of 2nd Workshop on General Purpose Processing on Graphics Processing Units, GPGPU-2, pp. 52–61. ACM, New York (2009)

    Chapter  Google Scholar 

  11. Koesterke, L., Boisseau, J., Cazes, J., Milfeld, K., Stanzione, D.: Early Experiences with the Intel Many Integrated Cores Accelerated Computing Technology. In: Proceedings of the 2011 TeraGrid Conference: Extreme Digital Discovery, TG 2011, pp. 21:1–21:8. ACM, New York (2011)

    Google Scholar 

  12. OpenACC: How does the openacc api relate to openmp api? (2013)

    Google Scholar 

  13. NVIDIA: OpenACC Directives for Accelerators. In: NVIDIA Developer Zone (2012),

  14. Oak Ridge Leadership Computing Facility: Introducing titan: Advancing the era of accelerated computing (2013),

  15. Center for Manycore Programming, Seoul National University, Korea: Snu npb suite site (2013)

    Google Scholar 

Download references

Author information

Authors and Affiliations


Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Baker, M., Pophale, S., Vasnier, JC., Jin, H., Hernandez, O. (2014). Hybrid Programming Using OpenSHMEM and OpenACC. In: Poole, S., Hernandez, O., Shamis, P. (eds) OpenSHMEM and Related Technologies. Experiences, Implementations, and Tools. OpenSHMEM 2014. Lecture Notes in Computer Science, vol 8356. Springer, Cham.

Download citation

  • DOI:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-05214-4

  • Online ISBN: 978-3-319-05215-1

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics