International Journal of Parallel Programming, Volume 28, Issue 6, pp 537–562

Automatic Parallelization of Recursive Procedures

  • Manish Gupta
  • Sayak Mukhopadhyay
  • Navin Sinha
Abstract

Parallelizing compilers have traditionally focussed mainly on parallelizing loops. This paper presents a new framework for automatically parallelizing recursive procedures that typically appear in divide-and-conquer algorithms. We present compile-time analysis, using powerful, symbolic array section analysis, to detect the independence of multiple recursive calls in a procedure. This allows exploitation of a scalable form of nested parallelism, where each parallel task can further spawn off parallel work in subsequent recursive calls. We describe a runtime system which efficiently supports this kind of nested parallelism without unnecessarily blocking tasks. We have implemented this framework in a parallelizing compiler, which is able to automatically parallelize programs like quicksort and mergesort, written in C. For cases where even the advanced compile-time analysis we describe is not able to prove the independence of procedure calls, we propose novel techniques for speculative runtime parallelization, which are more efficient and powerful in this context than analogous techniques proposed previously for speculatively parallelizing loops. Our experimental results on an IBM G30 SMP machine show good speedups obtained by following our approach.

automatic parallelization, recursive procedures, divide and conquer, symbolic analysis, array section analysis, speculative parallelization
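The independence property described in the abstract can be illustrated with a small C sketch. It is not the authors' compiler analysis or runtime system: it is a hand-written quicksort in which the two recursive calls operate on the array sections a[lo..p-1] and a[p+1..hi], with a runtime assertion standing in for what the symbolic array section analysis would prove at compile time. The `section` type and the helper `sections_disjoint` are hypothetical names introduced only for this illustration, and the calls still run sequentially here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor for a contiguous array section a[lo..hi]. */
typedef struct { long lo, hi; } section;

/* Two sections are independent when they share no index.
   An empty section (lo > hi) overlaps nothing. */
int sections_disjoint(section x, section y) {
    if (x.lo > x.hi || y.lo > y.hi) return 1;
    return x.hi < y.lo || y.hi < x.lo;
}

void swap_int(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Lomuto partition: returns the final pivot index p such that
   every element of a[lo..p-1] is < a[p] and every element of
   a[p+1..hi] is >= a[p]. */
long partition(int *a, long lo, long hi) {
    int pivot = a[hi];
    long i = lo;
    for (long j = lo; j < hi; j++) {
        if (a[j] < pivot) {
            swap_int(&a[i], &a[j]);
            i++;
        }
    }
    swap_int(&a[i], &a[hi]);
    return i;
}

void quicksort(int *a, long lo, long hi) {
    if (lo >= hi) return;
    long p = partition(a, lo, hi);

    /* Sections touched by the two recursive calls. */
    section left  = { lo, p - 1 };
    section right = { p + 1, hi };

    /* Assumption for illustration: a runtime check in place of the
       compile-time proof that the calls access disjoint sections. */
    assert(sections_disjoint(left, right));

    /* Because the sections are disjoint, these two calls could be
       run as parallel tasks; each may expose further parallelism
       in its own recursive calls. They run sequentially here. */
    quicksort(a, left.lo, left.hi);
    quicksort(a, right.lo, right.hi);
}
```

Calling `quicksort(a, 0, n - 1)` sorts an `n`-element array in place. Since `left` ends at `p - 1` and `right` starts at `p + 1`, the assertion holds for every pivot position, which is the kind of fact a symbolic analysis would need to establish without executing the program.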

Copyright information

© Plenum Publishing Corporation 2000

Authors and Affiliations

  • Manish Gupta (1)
  • Sayak Mukhopadhyay (2)
  • Navin Sinha (3)

  1. T. J. Watson Research Center, IBM, Yorktown Heights
  2. Mobius Management Systems, Rye
  3. IBM Global Services India, Bangalore, India
