Meta-programming Applied to Automatic SMP Parallelization of Linear Algebra Code

  • Joel Falcou
  • Jocelyn Sérot
  • Lucien Pech
  • Jean-Thierry Lapresté
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5168)


We describe a software solution to the problem of automatically parallelizing linear algebra code on multi-processor and multi-core architectures. The solution relies on a domain-specific language for matrix computations, a performance model for multi-processor architectures, and an implementation of both using C++ template meta-programming. Experimental results assess the model and its implementation on sample computation kernels.
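To make the expression-template idea concrete, here is a minimal sketch, not the authors' library. It assumes a hypothetical `Vec` type and an `Add` node: `operator+` builds a small abstract syntax tree at compile time, and assignment evaluates that tree in a single pass split across worker threads. The names `Expr`, `Add`, `Vec`, and `sum3` are illustrative, and the even split over `std::thread::hardware_concurrency()` workers stands in for the paper's performance model, which this page does not describe.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// CRTP base: marks a type as a node of the expression tree (hypothetical name).
template <class E>
struct Expr {
    const E& self() const { return static_cast<const E&>(*this); }
};

// Tree node for an element-wise sum; evaluated lazily, one element at a time.
template <class L, class R>
struct Add : Expr<Add<L, R>> {
    const L& lhs;
    const R& rhs;
    Add(const L& l, const R& r) : lhs(l), rhs(r) {}
    double operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
    std::size_t size() const { return lhs.size(); }
};

// Building the tree costs nothing at run time beyond storing two references.
template <class L, class R>
Add<L, R> operator+(const Expr<L>& l, const Expr<R>& r) {
    return Add<L, R>(l.self(), r.self());
}

// Hypothetical dense vector; a matrix type would follow the same pattern.
class Vec : public Expr<Vec> {
public:
    explicit Vec(std::size_t n, double v = 0.0) : data_(n, v) {}
    double operator[](std::size_t i) const { return data_[i]; }
    std::size_t size() const { return data_.size(); }

    // Assigning an expression triggers evaluation, split into contiguous
    // chunks with one chunk per worker (assumption: a plain even split).
    template <class E>
    Vec& operator=(const Expr<E>& e) {
        const E& expr = e.self();
        assert(expr.size() == size());
        const std::size_t n = size();
        std::size_t workers = std::thread::hardware_concurrency();  // may be 0
        workers = std::max<std::size_t>(1, std::min(workers, n));
        const std::size_t chunk = (n + workers - 1) / workers;
        auto run = [this, &expr, n, chunk](std::size_t w) {
            const std::size_t begin = std::min(n, w * chunk);
            const std::size_t end = std::min(n, begin + chunk);
            for (std::size_t i = begin; i < end; ++i) data_[i] = expr[i];
        };
        std::vector<std::thread> pool;
        for (std::size_t w = 1; w < workers; ++w) pool.emplace_back(run, w);
        run(0);  // the calling thread handles the first chunk
        for (std::thread& t : pool) t.join();
        return *this;
    }

private:
    std::vector<double> data_;
};

// Usage helper: evaluates a + b + c in one fused, chunked loop.
inline Vec sum3(const Vec& a, const Vec& b, const Vec& c) {
    Vec r(a.size());
    r = a + b + c;
    return r;
}
```

Because `a + b + c` only builds nested `Add` objects, no temporary vectors are allocated; the work happens once, inside `operator=`. With GCC or Clang the sketch may need the `-pthread` flag to link.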


Keywords (machine-generated, not supplied by the authors): Abstract Syntax Tree, Cell Processor, Automatic Parallelization, Expression Template, Analytical Performance Model





Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Joel Falcou (1)
  • Jocelyn Sérot (2)
  • Lucien Pech (3)
  • Jean-Thierry Lapresté (2)

  1. IEF, Université Paris Sud, Orsay, France
  2. LASMEA, Université Blaise Pascal, Clermont-Ferrand, France
  3. Ecole Normale Supérieure, Paris, France
