Exploring Functional Acceleration of OpenCL on FPGAs and GPUs Through Platform-Independent Optimizations

  • Conference paper
Applied Reconfigurable Computing. Architectures, Tools, and Applications (ARC 2018)

Abstract

OpenCL has been proposed as a means of accelerating functional computation using FPGA and GPU accelerators. Although it provides ease of programmability and code portability, questions remain about its performance portability and about the ability of the underlying vendor compilers to generate efficient implementations without user-defined, platform-specific optimizations. In this work, we systematically evaluate these questions by formalizing a design space exploration strategy that uses only platform-independent micro-architectural and application-specific optimizations. The optimizations are then applied across Altera FPGA, NVIDIA GPU and ARM Mali GPU platforms for three computing examples, namely matrix-matrix multiplication, binomial-tree option pricing and 3-dimensional finite difference time domain. Our strategy enables a fair comparison across platforms in terms of throughput and energy efficiency by using the same design effort. Our results indicate that the FPGA provides better performance portability in terms of the achieved percentage of the device’s peak performance (68%) compared to the NVIDIA GPU (20%) and also achieves better energy efficiency (up to 1.4×) for some of the considered cases, without requiring in-depth hardware design expertise.
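
The two comparison metrics named in the abstract, achieved percentage of peak performance and energy efficiency, can be illustrated with a minimal sketch. The function names, device labels, and all numbers below are hypothetical and chosen only to show the arithmetic; they are not the measurements reported in the paper, and energy efficiency is assumed here to mean throughput per watt.

```python
def percent_of_peak(achieved_gflops, peak_gflops):
    """Achieved throughput as a percentage of the device's theoretical peak."""
    if peak_gflops <= 0:
        raise ValueError("peak_gflops must be positive")
    return 100.0 * achieved_gflops / peak_gflops


def energy_efficiency(achieved_gflops, power_watts):
    """Throughput per watt (GFLOPS/W); an assumed definition of energy efficiency."""
    if power_watts <= 0:
        raise ValueError("power_watts must be positive")
    return achieved_gflops / power_watts


# Hypothetical devices: every value here is made up for illustration.
devices = {
    "device_a": {"achieved": 68.0, "peak": 100.0, "power": 25.0},
    "device_b": {"achieved": 200.0, "peak": 1000.0, "power": 100.0},
}

# Compute both metrics for each hypothetical device.
summary = {
    name: {
        "percent_of_peak": percent_of_peak(d["achieved"], d["peak"]),
        "gflops_per_watt": energy_efficiency(d["achieved"], d["power"]),
    }
    for name, d in devices.items()
}

for name, metrics in summary.items():
    print(name, metrics)
```

In this made-up case, device_b has the higher raw throughput, yet device_a uses a larger share of its peak and delivers more throughput per watt, which is why the two metrics are reported separately from absolute performance.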

The original version of this chapter has been revised: “George” was used instead of “Georgios” as the first name of the third author. This has been corrected. The erratum to this chapter is available at https://doi.org/10.1007/978-3-319-78890-6_60

Acknowledgment

This work was supported by the European Commission under the European Horizon 2020 Programme, grant number 6876281 (VINEYARD).

Author information

Corresponding author

Correspondence to Umar Ibrahim Minhas.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Minhas, U.I., Woods, R., Karakonstantis, G. (2018). Exploring Functional Acceleration of OpenCL on FPGAs and GPUs Through Platform-Independent Optimizations. In: Voros, N., Huebner, M., Keramidas, G., Goehringer, D., Antonopoulos, C., Diniz, P. (eds) Applied Reconfigurable Computing. Architectures, Tools, and Applications. ARC 2018. Lecture Notes in Computer Science, vol 10824. Springer, Cham. https://doi.org/10.1007/978-3-319-78890-6_44

  • DOI: https://doi.org/10.1007/978-3-319-78890-6_44

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-78889-0

  • Online ISBN: 978-3-319-78890-6

  • eBook Packages: Computer Science, Computer Science (R0)
