Automatic Parallelization of Recursive Procedures
Parallelizing compilers have traditionally focused mainly on parallelizing loops. This paper presents a new framework for automatically parallelizing recursive procedures that typically appear in divide-and-conquer algorithms. We present compile-time analysis, using powerful symbolic array section analysis, to detect the independence of multiple recursive calls in a procedure. This allows exploitation of a scalable form of nested parallelism, where each parallel task can further spawn off parallel work in subsequent recursive calls. We describe a runtime system which efficiently supports this kind of nested parallelism without unnecessarily blocking tasks. We have implemented this framework in a parallelizing compiler, which is able to automatically parallelize programs like quicksort and mergesort, written in C. For cases where even the advanced compile-time analysis we describe cannot prove the independence of procedure calls, we propose novel techniques for speculative runtime parallelization, which are more efficient and powerful in this context than analogous techniques previously proposed for speculatively parallelizing loops. Our experimental results on an IBM G30 SMP machine show good speedups obtained by following our approach.
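To make the idea of independent recursive calls concrete, the following is a minimal sketch in C of a quicksort whose two recursive calls touch disjoint array sections and can therefore run as nested parallel tasks. It is not the paper's compiler output or runtime system: OpenMP tasks are used here only as a stand-in for the runtime support, the names `sections_disjoint`, `partition`, `quicksort_rec`, and `quicksort` are hypothetical, the cutoff of 1024 elements is an arbitrary assumption, and the disjointness test is written as a runtime assertion, whereas the paper establishes it symbolically at compile time. If the file is compiled without OpenMP support, the pragmas are ignored and the code runs sequentially.

```c
#include <assert.h>

/* Hypothetical helper: returns 1 if the closed index ranges [lo1, hi1] and
   [lo2, hi2] share no element. An empty range (hi < lo) overlaps nothing. */
static int sections_disjoint(long lo1, long hi1, long lo2, long hi2)
{
    if (hi1 < lo1 || hi2 < lo2)
        return 1;
    return hi1 < lo2 || hi2 < lo1;
}

/* Lomuto partition around a[hi]; returns the pivot's final index p, with
   a[lo..p-1] < a[p] and a[p+1..hi] >= a[p]. */
static long partition(int *a, long lo, long hi)
{
    int pivot = a[hi];
    long i = lo;
    for (long j = lo; j < hi; j++) {
        if (a[j] < pivot) {
            int t = a[i]; a[i] = a[j]; a[j] = t;
            i++;
        }
    }
    int t = a[i]; a[i] = a[hi]; a[hi] = t;
    return i;
}

static void quicksort_rec(int *a, long lo, long hi)
{
    if (lo >= hi)
        return;
    long p = partition(a, lo, hi);

    /* The left call only accesses a[lo..p-1] and the right call only
       a[p+1..hi]; this is the independence property the analysis must prove.
       Here it is merely checked at run time for illustration. */
    assert(sections_disjoint(lo, p - 1, p + 1, hi));

    /* Each call becomes a task that may itself spawn further tasks (nested
       parallelism). The size cutoff is an assumed tuning knob. */
    #pragma omp task if (hi - lo > 1024)
    quicksort_rec(a, lo, p - 1);
    #pragma omp task if (hi - lo > 1024)
    quicksort_rec(a, p + 1, hi);
    #pragma omp taskwait
}

/* Entry point: one thread starts the recursion, the team executes the tasks. */
void quicksort(int *a, long n)
{
    #pragma omp parallel
    #pragma omp single
    quicksort_rec(a, 0, n - 1);
}
```

Calling `quicksort(v, n)` on an `int` array sorts it in ascending order. The `taskwait` keeps the sketch simple by making a parent wait for both children, which is a more conservative choice than the non-blocking behaviour the abstract attributes to the paper's runtime system.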
International Journal of Parallel Programming
Volume 28, Issue 6, pp. 537–562
- Kluwer Academic Publishers-Plenum Publishers
- automatic parallelization
- recursive procedures
- divide and conquer
- symbolic analysis
- array section analysis
- speculative parallelization