Generalized Index-Set Splitting

  • Christopher Barton
  • Arie Tal
  • Bob Blainey
  • José Nelson Amaral
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3443)

Abstract

This paper introduces Index-Set Splitting (ISS), a technique that splits a loop containing several conditional statements into several loops with simpler control flow. In contrast to the classic loop unswitching technique, ISS splits loops even when the conditional is loop variant. ISS uses an Index Sub-range Tree (IST) to identify the structure of the conditionals in the loop and to select which conditionals should be eliminated. This decision is based on an estimate of the code growth for each candidate split: a greedy algorithm spends a pre-determined code growth budget. ISTs separate the decision about which splits to perform from the actual code generation for the split loops. The paper then discusses the use of ISS to improve a loop fusion framework. The identification of ISS opportunities in the SPEC2000 benchmark suite and three other suites demonstrates that ISS is a general technique that may benefit other compilers.
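
To illustrate the core idea, the following is a minimal sketch (not code from the paper; the function and array names are hypothetical). A loop whose body branches on the loop-variant condition i < m cannot be unswitched, but its index set can be split at i = m into two loops, each with branch-free control flow:

  /* Original loop: the condition i < m is loop variant, so classic
     loop unswitching does not apply. */
  void before_iss(double *a, const double *b, int n, int m) {
      for (int i = 0; i < n; i++) {
          if (i < m)
              a[i] = b[i] * 2.0;
          else
              a[i] = 0.0;
      }
  }

  /* After index-set splitting at i = m (clamped into [0, n]),
     each sub-range loop is free of the conditional. */
  void after_iss(double *a, const double *b, int n, int m) {
      int split = m < 0 ? 0 : (m > n ? n : m);
      for (int i = 0; i < split; i++)
          a[i] = b[i] * 2.0;
      for (int i = split; i < n; i++)
          a[i] = 0.0;
  }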

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Christopher Barton (1)
  • Arie Tal (2)
  • Bob Blainey (2)
  • José Nelson Amaral (1)
  1. Department of Computing Science, University of Alberta, Edmonton, Canada
  2. IBM Toronto Software Laboratory, Toronto, Canada
