An Investigation into the Performance and Portability of SYCL Compiler Implementations

  • Conference paper
High Performance Computing (ISC High Performance 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13999)

Abstract

In June 2022, Frontier became the first supercomputer to “officially” break the ExaFLOP/s barrier on LINPACK, achieving a peak performance of \(1.1 \times 10^{18}\) floating-point operations per second using AMD Instinct accelerators. Developing high-performance applications for such platforms typically requires adopting vendor-specific programming models, which in turn may limit portability. SYCL is a high-level, single-source language based on C++17, developed by the Khronos Group to overcome the shortcomings of those vendor-specific HPC programming models. In this paper we present an initial study of the SYCL parallel programming model and the compilers that implement it, to understand its performance and portability and how these compare with other parallel programming models. We evaluate three major SYCL implementations, Open SYCL (previously hipSYCL), DPC++, and ComputeCpp, on a range of CPU and GPU hardware from Intel, AMD, Fujitsu, Marvell, and NVIDIA. Our results show that for a simple finite-difference mini-application, SYCL can offer performance competitive with native approaches, while for a more complex finite-element mini-application, significant performance degradation is observed. Our findings suggest that development work is required at both the compiler and application levels to ensure SYCL is competitive with alternative approaches.

Notes

  1. https://www.top500.org/lists/top500/2022/06/

  2. https://opensycl.github.io

  3. https://www.intel.com/content/www/us/en/developer/tools/oneapi/dpc-compiler.html

  4. https://github.com/intel/llvm

  5. https://developer.codeplay.com/products/computecpp/ce/home/

  6. https://github.com/zjin-lcf/HeCBench

Acknowledgements

Many of the results in this paper were gathered on the Isambard UK National Tier-2 HPC Service (http://gw4.ac.uk/isambard/) operated by GW4 and the UK Met Office, and funded by EPSRC (EP/P020224/1).

Access to the Intel HD Graphics P630 GPU was provided by Intel through the Intel Developer Cloud.

The ExCALIBUR programme (https://excalibur.ac.uk/) is supported by the UKRI Strategic Priorities Fund. The programme is co-delivered by the Met Office and EPSRC in partnership with the Public Sector Research Establishment, the UK Atomic Energy Authority (UKAEA) and UKRI research councils, including NERC, MRC and STFC.

Author information

Corresponding author

Correspondence to Steven A. Wright.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Shilpage, W.R., Wright, S.A. (2023). An Investigation into the Performance and Portability of SYCL Compiler Implementations. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds) High Performance Computing. ISC High Performance 2023. Lecture Notes in Computer Science, vol 13999. Springer, Cham. https://doi.org/10.1007/978-3-031-40843-4_45

  • DOI: https://doi.org/10.1007/978-3-031-40843-4_45

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-40842-7

  • Online ISBN: 978-3-031-40843-4

  • eBook Packages: Computer Science, Computer Science (R0)
