Design and Implementation of a Cost-Optimal Parallel Tridiagonal System Solver Using Skeletons

  • Holger Bischof
  • Sergei Gorlatch
  • Emanuel Kitzelmann
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2763)


We address the problem of systematically designing correct parallel programs and developing efficient implementations of them on parallel machines. The design process starts with an intuitive, sequential algorithm and proceeds by expressing it in terms of well-defined, pre-implemented parallel components called skeletons. We demonstrate the skeleton-based design process using a tridiagonal system solver as our example application. Step by step, we develop three provably correct parallel versions of the application and finally arrive at a cost-optimal implementation in MPI (Message Passing Interface). The performance of our solutions is demonstrated experimentally on a Cray T3E machine.
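As an illustration of the kind of sequential starting point the abstract refers to, the following is a minimal sketch of the standard Thomas algorithm for a tridiagonal system. This is an assumption made for illustration: the abstract does not say which sequential algorithm the authors start from, and the function name `solve_tridiagonal` and its argument layout are hypothetical. The sketch performs no pivoting, so it assumes a system for which none is needed (for example, a diagonally dominant one).

```python
from typing import List


def solve_tridiagonal(a: List[float], b: List[float],
                      c: List[float], d: List[float]) -> List[float]:
    """Solve a tridiagonal system with the Thomas algorithm (illustrative sketch).

    a: sub-diagonal   (a[0] is unused)
    b: main diagonal
    c: super-diagonal (c[n-1] is unused)
    d: right-hand side
    No pivoting is done; a zero pivot raises ZeroDivisionError.
    """
    n = len(b)
    if n == 0 or not (len(a) == len(c) == len(d) == n):
        raise ValueError("all four inputs must be non-empty and of equal length")

    c_mod = [0.0] * n  # modified super-diagonal
    d_mod = [0.0] * n  # modified right-hand side

    # Forward elimination: a left-to-right sweep over the rows.
    c_mod[0] = c[0] / b[0]
    d_mod[0] = d[0] / b[0]
    for i in range(1, n):
        pivot = b[i] - a[i] * c_mod[i - 1]
        c_mod[i] = c[i] / pivot
        d_mod[i] = (d[i] - a[i] * d_mod[i - 1]) / pivot

    # Back substitution: a right-to-left sweep over the rows.
    x = [0.0] * n
    x[n - 1] = d_mod[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = d_mod[i] - c_mod[i] * x[i + 1]
    return x


if __name__ == "__main__":
    # 3x3 example: 2*x1 - x2 = 1, -x1 + 2*x2 - x3 = 0, -x2 + 2*x3 = 1
    print(solve_tridiagonal([0.0, -1.0, -1.0], [2.0, 2.0, 2.0],
                            [-1.0, -1.0, 0.0], [1.0, 0.0, 1.0]))
```

Each sweep carries a dependency from one row to the next, which is why a sequential solver of this shape cannot be parallelized by simply splitting the loops and a restructuring in terms of parallel components is needed.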


Keywords: Parallel Machine; Message Passing Interface; Parallel Implementation; Sequential Algorithm; Generic Implementation





Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Holger Bischof, Technische Universität Berlin, Germany
  • Sergei Gorlatch, Technische Universität Berlin, Germany
  • Emanuel Kitzelmann, Technische Universität Berlin, Germany
