Generic Parallel Programming Using C++ Templates and Skeletons

  • Holger Bischof
  • Sergei Gorlatch
  • Roman Leshchinskiy
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3016)


We study how generic programming with C++ templates, as realized in the Standard Template Library (STL), can be exploited efficiently in the domain of parallel programming. We present our approach, implemented in the DatTeL data-parallel library, which allows simple programming for various parallel architectures while staying within the paradigm of classical C++ template programming. The novelty of DatTeL is its use of higher-order parallel constructs (skeletons) in the STL context and the easy extensibility of the library with new, domain-specific skeletons. We describe the principles of our skeleton-based approach and explain our design decisions and their implementation in the library. The presentation is illustrated with a case study: the parallelization of a generic algorithm for carry-lookahead addition. We compare DatTeL to related work and report both absolute performance and speedups achieved for the case study on parallel machines with shared and distributed memory.
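To make the case study more concrete, the following is a minimal sketch of carry-lookahead addition expressed as a map followed by a scan, written in plain STL style. It is not DatTeL code: the abstract does not spell out the algorithm or the library's interface, so the scan-based formulation, the names `Carry`, `combine`, and `carry_lookahead_add`, and the use of the sequential `std::partial_sum` as a stand-in for a parallel scan skeleton are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical carry status of one bit position (names are illustrative).
enum class Carry { Kill, Propagate, Generate };

// Associative operator for the scan: the more significant status wins
// unless it merely propagates whatever arrives from the lower positions.
inline Carry combine(Carry lower, Carry higher) {
    return higher == Carry::Propagate ? lower : higher;
}

// Adds two equal-length bit vectors stored least-significant bit first.
// Returns a vector one bit longer that holds the final carry at the end.
std::vector<int> carry_lookahead_add(const std::vector<int>& a,
                                     const std::vector<int>& b) {
    assert(a.size() == b.size());
    const std::size_t n = a.size();

    // Map step: classify each position independently.
    std::vector<Carry> status(n);
    for (std::size_t i = 0; i < n; ++i) {
        status[i] = (a[i] && b[i])   ? Carry::Generate
                  : (!a[i] && !b[i]) ? Carry::Kill
                                     : Carry::Propagate;
    }

    // Scan step: carry_out[i] summarizes positions 0..i. A sequential
    // std::partial_sum stands in here for a parallel scan skeleton.
    std::vector<Carry> carry_out(n);
    std::partial_sum(status.begin(), status.end(), carry_out.begin(), combine);

    // Final map step: each sum bit depends only on its own inputs and the
    // carry out of the previous position (the incoming carry at bit 0 is 0).
    std::vector<int> sum(n + 1, 0);
    for (std::size_t i = 0; i < n; ++i) {
        const int carry_in =
            (i > 0 && carry_out[i - 1] == Carry::Generate) ? 1 : 0;
        sum[i] = a[i] ^ b[i] ^ carry_in;
    }
    sum[n] = (n > 0 && carry_out[n - 1] == Carry::Generate) ? 1 : 0;
    return sum;
}
```

Because `combine` is associative, the scan is the only step with a dependency across positions, and it is the part a parallel scan skeleton could take over; the two map steps are element-wise and independent.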





Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Holger Bischof (1)
  • Sergei Gorlatch (1)
  • Roman Leshchinskiy (2)
  1. University of Münster, Germany
  2. Technical University of Berlin, Germany
