A Pattern-Based Comparison of OpenACC and OpenMP for Accelerator Computing

  • Sandra Wienke
  • Christian Terboven
  • James C. Beyer
  • Matthias S. Müller
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8632)

Abstract

HPC systems are frequently built as clusters of commodity processors with attached accelerators. The directive-based programming models OpenACC and OpenMP are promising candidates for moving from tedious low-level accelerator programming toward higher development productivity. While OpenACC was completed about two years ago, OpenMP only recently added support for accelerator programming. To help developers decide which approach to take, we compare both models with respect to their programmability. Besides investigating their expressiveness by putting their constructs side by side, we focus on evaluating their power using structured parallel programming patterns (also known as algorithmic skeletons). These patterns describe the basic entities of parallel algorithms; we cover the patterns map, stencil, reduction, fork-join, superscalar sequence, nesting, and geometric decomposition. The architectural targets of this work are NVIDIA-type accelerators (GPUs) and the special characteristics of Intel-type accelerators (Xeon Phis). Additionally, we assess the prospects of OpenACC and OpenMP with respect to future developments in software and hardware design.
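To make the side-by-side idea concrete, the following is a minimal sketch of two of the patterns named above, map and reduction, written once with OpenACC directives and once with OpenMP 4.0 device directives. It is not taken from the paper: the function names (`saxpy_acc`, `saxpy_omp`, `sum_acc`, `sum_omp`) and the choice of SAXPY as the map example are assumptions made for illustration, and the data clauses shown are only one plausible way to express the transfers.

```c
#include <stddef.h>

/* Hypothetical helper names; illustrative only, not from the paper.
 * Without OpenACC/OpenMP compiler flags the pragmas are ignored and
 * the loops simply run serially on the host. */

/* Map pattern (SAXPY), OpenACC: y[i] = a * x[i] + y[i]. */
void saxpy_acc(int n, float a, const float *x, float *y)
{
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

/* Same map pattern, OpenMP 4.0 device constructs. */
void saxpy_omp(int n, float a, const float *x, float *y)
{
    #pragma omp target teams distribute parallel for \
        map(to: x[0:n]) map(tofrom: y[0:n])
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

/* Reduction pattern (sum), OpenACC. */
float sum_acc(int n, const float *x)
{
    float s = 0.0f;
    #pragma acc parallel loop reduction(+:s) copyin(x[0:n])
    for (int i = 0; i < n; ++i)
        s += x[i];
    return s;
}

/* Same reduction pattern, OpenMP 4.0 device constructs.
 * The scalar s is mapped explicitly so the result returns to the host. */
float sum_omp(int n, const float *x)
{
    float s = 0.0f;
    #pragma omp target teams distribute parallel for \
        reduction(+:s) map(to: x[0:n]) map(tofrom: s)
    for (int i = 0; i < n; ++i)
        s += x[i];
    return s;
}
```

In both models the loop body is unchanged and only the directive differs, which is the kind of construct-level correspondence a pattern-based comparison can examine. Whether either variant performs well on a given accelerator depends on the compiler and hardware and is not implied by this sketch.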

Keywords

OpenACC, OpenMP 4, GPU, Xeon Phi, programmability, parallel patterns

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Sandra Wienke (1, 2)
  • Christian Terboven (1, 2)
  • James C. Beyer (3)
  • Matthias S. Müller (1, 2)

  1. IT Center, RWTH Aachen University, Aachen, Germany
  2. JARA – High-Performance Computing, Aachen, Germany
  3. Cray Inc., St. Paul, USA
