Expression Templates and OpenCL

  • Uwe Bawidamann
  • Marco Nehmeier
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7204)


In this paper we discuss how expression templates interact with OpenCL devices. We show how the expression tree built by expression templates can be used to generate problem-specific OpenCL kernels. In a second approach, we use expression templates to optimize the data transfer between the host and the device, which leads to a measurable performance increase in a domain-specific language approach. We tested the functionality, correctness, and performance of both implementations in a case study on vector and matrix operations.
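
To make the first idea concrete, the following is a minimal sketch of how an expression tree might be walked to produce OpenCL C kernel source. It is not the authors' implementation: the names `Vec`, `BinExpr`, `Stored`, `ExprTag`, `make_kernel`, and `example_kernel` are hypothetical, only element-wise `+` and `*` on `float` vectors are covered, and the sketch only builds the source string (compiling and launching it through the OpenCL runtime is left out).

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical tag that marks types taking part in the expression tree.
struct ExprTag {};

// Hypothetical leaf node: a host-side vector bound to one kernel argument.
struct Vec : ExprTag {
    std::vector<float> data;
    explicit Vec(std::size_t n, float v = 0.0f) : data(n, v) {}

    // Register this vector as a kernel argument (once) and emit its access.
    std::string emit(std::vector<const Vec*>& args) const {
        std::size_t idx = 0;
        while (idx < args.size() && args[idx] != this) ++idx;
        if (idx == args.size()) args.push_back(this);
        return "v" + std::to_string(idx) + "[i]";
    }
};

// Leaves are stored by reference, inner nodes by value, so that a tree
// never refers to an already destroyed temporary node.
template <class T> struct Stored { using type = T; };
template <> struct Stored<Vec> { using type = const Vec&; };

// Inner node: the operator is encoded in the type, as in expression templates.
template <class L, class R, char Op>
struct BinExpr : ExprTag {
    typename Stored<L>::type lhs;
    typename Stored<R>::type rhs;
    BinExpr(const L& l, const R& r) : lhs(l), rhs(r) {}

    std::string emit(std::vector<const Vec*>& args) const {
        std::string l = lhs.emit(args);  // left first: fixes argument order
        std::string r = rhs.emit(args);
        return "(" + l + " " + Op + " " + r + ")";
    }
};

// Restrict the overloaded operators to expression types only.
template <class L, class R>
using EnableIfExprs = std::enable_if_t<std::is_base_of<ExprTag, L>::value &&
                                       std::is_base_of<ExprTag, R>::value>;

template <class L, class R, class = EnableIfExprs<L, R>>
BinExpr<L, R, '+'> operator+(const L& l, const R& r) {
    return BinExpr<L, R, '+'>(l, r);
}

template <class L, class R, class = EnableIfExprs<L, R>>
BinExpr<L, R, '*'> operator*(const L& l, const R& r) {
    return BinExpr<L, R, '*'>(l, r);
}

// Walk the tree once and assemble an element-wise OpenCL C kernel.
template <class E>
std::string make_kernel(const std::string& name, const E& expr) {
    assert(!name.empty());
    std::vector<const Vec*> args;
    const std::string body = expr.emit(args);

    std::ostringstream src;
    src << "__kernel void " << name << "(__global float* out";
    for (std::size_t k = 0; k < args.size(); ++k)
        src << ", __global const float* v" << k;
    src << ") {\n"
        << "    size_t i = get_global_id(0);\n"
        << "    out[i] = " << body << ";\n"
        << "}\n";
    return src.str();
}

// Usage: one fused kernel for a + b * c instead of one kernel per operator.
std::string example_kernel() {
    Vec a(4), b(4), c(4);
    return make_kernel("fused", a + b * c);
}
```

In this sketch, `a + b * c` yields a single kernel whose body is `out[i] = (v0[i] + (v1[i] * v2[i]));`, so no temporary vector for `b * c` has to be materialized on the device.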


Keywords: GPGPU, OpenCL, C++, expression templates, domain-specific language, code generation





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Uwe Bawidamann (1)
  • Marco Nehmeier (1)

  1. Institute of Computer Science, University of Würzburg, Am Hubland, Würzburg, Germany
