
Recursion Unrolling for Divide and Conquer Programs

  • Conference paper
Languages and Compilers for Parallel Computing (LCPC 2000)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2017)

Abstract

This paper presents recursion unrolling, a technique for improving the performance of recursive computations. Conceptually, recursion unrolling inlines recursive calls to reduce control flow overhead and increase the size of the basic blocks in the computation, which in turn increases the effectiveness of standard compiler optimizations such as register allocation and instruction scheduling. We have identified two transformations that significantly improve the effectiveness of the basic recursion unrolling technique. Conditional fusion merges conditionals with identical expressions, considerably simplifying the control flow in unrolled procedures. Recursion re-rolling rolls back the recursive part of the procedure to ensure that a large unrolled base case is always executed, regardless of the input problem size.
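The three transformations named above can be illustrated with a minimal C sketch. It is not taken from the paper: the procedure dc_inc, its variants, and the counter size1_base_hits are hypothetical names chosen for illustration, the sketch assumes n is a power of two, and the unrolling is done by hand for a single level only.

```c
/* Counts how often the one-element base case runs (illustration only). */
int size1_base_hits = 0;

/* Original divide and conquer procedure: add 1 to each of n elements.
   Assumption for this sketch: n is a power of two. */
void dc_inc(int *p, int n) {
    if (n == 1) {
        p[0] += 1;
    } else {
        dc_inc(p, n / 2);
        dc_inc(p + n / 2, n / 2);
    }
}

/* Step 1, unrolling: both recursive calls are inlined once. Each inlined
   body brings its own copy of the test, now applied to n / 2. */
void dc_inc_unrolled(int *p, int n) {
    if (n == 1) {
        p[0] += 1;
    } else {
        if (n / 2 == 1) {
            p[0] += 1;
        } else {
            dc_inc_unrolled(p, n / 4);
            dc_inc_unrolled(p + n / 4, n / 4);
        }
        if (n / 2 == 1) {
            p[n / 2] += 1;
        } else {
            dc_inc_unrolled(p + n / 2, n / 4);
            dc_inc_unrolled(p + n / 2 + n / 4, n / 4);
        }
    }
}

/* Step 2, conditional fusion: the two identical tests on n / 2 are merged,
   which leaves one straight-line base case for n == 2. */
void dc_inc_fused(int *p, int n) {
    if (n == 1) {
        size1_base_hits++;
        p[0] += 1;
    } else if (n / 2 == 1) {
        p[0] += 1;
        p[1] += 1;
    } else {
        dc_inc_fused(p, n / 4);
        dc_inc_fused(p + n / 4, n / 4);
        dc_inc_fused(p + n / 2, n / 4);
        dc_inc_fused(p + n / 2 + n / 4, n / 4);
    }
}

/* Step 3, re-rolling: the unrolled base cases are kept, but the original
   recursive part is restored so the problem size halves at every step. */
void dc_inc_rerolled(int *p, int n) {
    if (n == 1) {
        size1_base_hits++;
        p[0] += 1;
    } else if (n / 2 == 1) {
        p[0] += 1;
        p[1] += 1;
    } else {
        dc_inc_rerolled(p, n / 2);
        dc_inc_rerolled(p + n / 2, n / 2);
    }
}
```

On a 16-element array, dc_inc_fused passes through sizes 16, 4, and 1, so every leaf call lands in the small one-element base case. dc_inc_rerolled passes through sizes 16, 8, 4, and 2, so every leaf call lands in the larger two-element base case, which is the effect the abstract attributes to re-rolling. All four variants leave the array in the same final state.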

We have implemented our techniques and applied them to an important class of recursive programs: divide and conquer programs. Our experimental results show that recursion unrolling can improve the performance of our programs by a factor of between 3.6 and 10.8, depending on the combination of program and architecture.

Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Rugina, R., Rinard, M. (2001). Recursion Unrolling for Divide and Conquer Programs. In: Midkiff, S.P., et al. Languages and Compilers for Parallel Computing. LCPC 2000. Lecture Notes in Computer Science, vol 2017. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45574-4_3

  • DOI: https://doi.org/10.1007/3-540-45574-4_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-42862-6

  • Online ISBN: 978-3-540-45574-5

  • eBook Packages: Springer Book Archive
