Transformation of divide & conquer to nested parallel loops

  • Christoph A. Herrmann
  • Christian Lengauer
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1292)


We propose a sequence of equational transformations and specializations which turns a divide-and-conquer skeleton in Haskell into a parallel loop nest in C. Our initial skeleton is often viewed as general divide-and-conquer. The specializations impose a balanced call tree, a fixed degree of the problem division, and elementwise operations. Our goal is to select parallel implementations of divide-and-conquer via a space-time mapping, which can be determined at compile time. The correctness of our transformations is proved by equational reasoning in Haskell; recursion and iteration are handled by induction. Finally, we demonstrate the practicality of the skeleton by expressing Strassen's matrix multiplication in it.
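
The step from recursion to loop nest can be illustrated by a minimal sketch in C, under stated assumptions. It is not the skeleton or the transformation from the paper: it swaps in a much simpler instance (a Walsh-Hadamard-style butterfly on an array whose length is a power of two) that happens to satisfy the three specializations named above, namely a balanced call tree, a fixed division degree of 2, and an elementwise combine. The names `combine`, `bfly_rec`, and `bfly_loops` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Elementwise combine of two halves of length h (hypothetical helper).
   Each index i is handled independently of all others. */
static void combine(long *lo, long *hi, size_t h) {
    for (size_t i = 0; i < h; i++) {
        long x = lo[i], y = hi[i];
        lo[i] = x + y;
        hi[i] = x - y;
    }
}

/* Recursive form: balanced call tree, fixed division degree 2.
   Assumes n is a power of two. */
void bfly_rec(long *a, size_t n) {
    assert(n != 0 && (n & (n - 1)) == 0);
    if (n == 1) return;          /* base case: identity */
    size_t h = n / 2;
    bfly_rec(a, h);              /* conquer the left half */
    bfly_rec(a + h, h);          /* conquer the right half */
    combine(a, a + h, h);        /* elementwise combine */
}

/* Loop-nest form of the same computation.
   Outer loop: time (one step per level of the call tree, leaves first).
   Middle loop and the loop inside combine: space (iterations are
   independent, so they could run in parallel). */
void bfly_loops(long *a, size_t n) {
    assert(n != 0 && (n & (n - 1)) == 0);
    for (size_t h = 1; h < n; h *= 2)
        for (size_t b = 0; b < n; b += 2 * h)
            combine(a + b, a + b + h, h);
}
```

Called on the same input, both functions perform the same combine steps on the same blocks; only the order in which independent blocks are visited differs. Because the tree is balanced and the degree is fixed, the loop bounds depend only on `n`, which is what lets a schedule of this kind be fixed before run time.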


divide-and-conquer, equational reasoning, Haskell, parallelization, skeleton, space-time mapping





Copyright information

© Springer-Verlag Berlin Heidelberg 1997

Authors and Affiliations

  • Christoph A. Herrmann (1)
  • Christian Lengauer (1)
  1. Fakultät für Mathematik und Informatik, Universität Passau, Germany
