Fast Expression Templates

  • Jochen Härdtlein
  • Alexander Linke
  • Christoph Pflaum
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3515)


Expression templates (ET) can significantly reduce the implementation effort of mathematical software. With some compilers, however, especially those of supercomputers, classical ET implementations do not deliver the expected performance. The reason is that resolving pointer aliasing becomes much more difficult in combination with the complicated ET constructs. We therefore introduce the concept of enumerated variables, which are provided with an additional integer template parameter. With this new implementation of ET we obtain C++ code whose performance is very close to that of handcrafted C code. The performance results of these so-called Fast ET are presented for the Hitachi SR8000 supercomputer and the NEC SX6, both with automatic vectorization and parallelization. Additionally, we studied the combination of Fast ET and OpenMP on a high-performance Opteron cluster.
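The abstract does not show code, so the following is only a minimal sketch of what an "enumerated variable" might look like, under stated assumptions. The names `Expr`, `EVector`, `Add`, `Mul`, and `triad_demo` are hypothetical, and keeping the storage in a static member is an assumption made for this illustration, not a description of the authors' implementation. The idea shown is that the extra integer template parameter makes every variable its own type, so an expression such as the vector triad `a = b + c * d` can be encoded entirely in types, and the expression objects themselves carry no pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CRTP tag: restricts the overloaded operators to ET types.
template <class E>
struct Expr {};

// Stateless expression nodes: the operands are identified by type only.
template <class L, class R>
struct Add : Expr<Add<L, R> > {
    static double at(std::size_t i) { return L::at(i) + R::at(i); }
};

template <class L, class R>
struct Mul : Expr<Mul<L, R> > {
    static double at(std::size_t i) { return L::at(i) * R::at(i); }
};

// "Enumerated" vector: the integer Id makes each variable a distinct type.
// Assumption of this sketch: the storage lives in a static member per Id.
template <int Id>
struct EVector : Expr<EVector<Id> > {
    static std::vector<double> data;

    static void resize(std::size_t n, double value) { data.assign(n, value); }
    static double at(std::size_t i) { return data[i]; }

    // Assigning an expression evaluates it in a single loop, element by element.
    template <class E>
    EVector& operator=(const Expr<E>&) {
        const std::size_t n = data.size();
        for (std::size_t i = 0; i < n; ++i) data[i] = E::at(i);
        return *this;
    }
};

template <int Id>
std::vector<double> EVector<Id>::data;

// Operators only build types; no operand addresses are stored anywhere.
template <class L, class R>
Add<L, R> operator+(const Expr<L>&, const Expr<R>&) { return Add<L, R>(); }

template <class L, class R>
Mul<L, R> operator*(const Expr<L>&, const Expr<R>&) { return Mul<L, R>(); }

// Vector triad a = b + c * d with four differently enumerated vectors.
inline std::vector<double> triad_demo() {
    EVector<1> a;
    EVector<2> b;
    EVector<3> c;
    EVector<4> d;
    a.resize(4, 0.0);
    b.resize(4, 1.0);
    c.resize(4, 2.0);
    d.resize(4, 3.0);
    a = b + c * d;  // right-hand side has type Add<EVector<2>, Mul<EVector<3>, EVector<4> > >
    return EVector<1>::data;
}
```

In this sketch, calling `triad_demo()` fills `a` from `b + c * d` in one loop. A classical ET implementation would instead store references or pointers to the operands inside the expression object, which is where the aliasing difficulty described in the abstract arises. A consequence of the enumeration is that two variables declared with the same integer would share one type, so each variable needs its own number.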


Keywords: Triad, Encapsulation, Aliasing



Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Jochen Härdtlein (1)
  • Alexander Linke (1)
  • Christoph Pflaum (1)
  1. Department of Computer Science 10, System Simulation Group, University of Erlangen, Erlangen, Germany
