Automatic Parallelization of Recursive Procedures

International Journal of Parallel Programming

Abstract

Parallelizing compilers have traditionally focused mainly on loops. This paper presents a new framework for automatically parallelizing the recursive procedures that typically appear in divide-and-conquer algorithms. We present a compile-time analysis, based on powerful symbolic array section analysis, that detects the independence of multiple recursive calls in a procedure. This allows exploitation of a scalable form of nested parallelism, in which each parallel task can spawn further parallel work in subsequent recursive calls. We describe a runtime system that efficiently supports this kind of nested parallelism without unnecessarily blocking tasks. We have implemented this framework in a parallelizing compiler, which is able to automatically parallelize programs such as quicksort and mergesort written in C. For cases where even the advanced compile-time analysis we describe cannot prove the independence of procedure calls, we propose novel techniques for speculative runtime parallelization, which are more efficient and powerful in this context than analogous techniques previously proposed for speculatively parallelizing loops. Our experimental results on an IBM G30 SMP machine show good speedups obtained by following our approach.
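
To make the independence property concrete, the following is a minimal sketch in C of a mergesort whose two recursive calls touch disjoint array sections. It is an illustration, not the paper's implementation: the names `section`, `sections_disjoint`, `merge`, and `merge_sort` are hypothetical, half-open index ranges are an assumption, and the code runs sequentially. The comments only mark where a runtime supporting nested parallelism could spawn tasks.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor (not from the paper) for the array section a
   call may read or write: the half-open index range [lo, hi). */
typedef struct { size_t lo, hi; } section;

/* Two sections are independent when their index ranges do not overlap. */
int sections_disjoint(section a, section b) {
    return a.hi <= b.lo || b.hi <= a.lo;
}

/* Merge the sorted runs a[lo, mid) and a[mid, hi) through scratch space tmp. */
void merge(int *a, int *tmp, size_t lo, size_t mid, size_t hi) {
    size_t i = lo, j = mid, k = lo;
    while (i < mid && j < hi)
        tmp[k++] = (a[j] < a[i]) ? a[j++] : a[i++];
    while (i < mid) tmp[k++] = a[i++];
    while (j < hi)  tmp[k++] = a[j++];
    memcpy(a + lo, tmp + lo, (hi - lo) * sizeof a[0]);
}

/* Sort a[lo, hi), assuming lo <= hi. Both a and tmp are accessed only
   at indices within [lo, hi). */
void merge_sort(int *a, int *tmp, size_t lo, size_t hi) {
    if (hi - lo < 2) return;
    size_t mid = lo + (hi - lo) / 2;

    /* The property an analysis would need to establish symbolically:
       the sections touched by the two calls never overlap. Here it is
       only checked at run time for illustration. */
    section left = { lo, mid }, right = { mid, hi };
    assert(sections_disjoint(left, right));

    merge_sort(a, tmp, lo, mid);  /* independent: could be spawned as a task */
    merge_sort(a, tmp, mid, hi);  /* independent: could be spawned as a task */

    /* Both calls must have finished before their results are merged. */
    merge(a, tmp, lo, mid, hi);
}
```

A caller would pass a scratch buffer as long as the input, for example `merge_sort(a, tmp, 0, n)` for arrays `a` and `tmp` of `n` elements. Because each call confines its accesses to its own index range, every level of the recursion exposes two independent calls, which is the nested form of parallelism the abstract describes.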




Cite this article

Gupta, M., Mukhopadhyay, S. & Sinha, N. Automatic Parallelization of Recursive Procedures. International Journal of Parallel Programming 28, 537–562 (2000). https://doi.org/10.1023/A:1007560600904

  • DOI: https://doi.org/10.1023/A:1007560600904