A Case for Combining Compile-Time and Run-Time Parallelization

  • Conference paper
Languages, Compilers, and Run-Time Systems for Scalable Computers (LCR 1998)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1511)

Abstract

This paper demonstrates that significant improvements to automatic parallelization technology require that existing systems be extended in two ways: (1) they must combine high-quality compile-time analysis with low-cost run-time testing; and (2) they must take control flow into account during analysis. We support this claim with the results of an experiment that measures the safety of parallelization at run time for loops left unparallelized by the Stanford SUIF compiler’s automatic parallelization system. We present results of measurements on programs from two benchmark suites, SPECfp95 and the NAS sample benchmarks, which identify inherently parallel loops in these programs that are missed by the compiler. We characterize the remaining parallelization opportunities and find that most of the loops require run-time testing, analysis of control flow, or some combination of the two. We present a new compile-time analysis technique that can be used to parallelize most of these remaining parallel loops. This technique is designed not only to improve the results of compile-time parallelization, but also to produce low-cost, directed run-time tests that allow the system to defer binding of parallelization until run time when safety cannot be proven statically. We call this approach predicated array data-flow analysis. We augment array data-flow analysis, which the compiler uses to identify independent and privatizable arrays, by associating a predicate with each array data-flow value. Predicated array data-flow analysis allows the compiler to derive “optimistic” data-flow values guarded by predicates; these predicates can be used to derive a run-time test guaranteeing the safety of parallelization.
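To make the idea of a predicate-guarded run-time test concrete, the following is a minimal sketch of a two-version loop, not the paper's algorithm or SUIF's generated code. It assumes a hypothetical loop `a[i + k] = a[i] + 1` over `0 <= i < n` with `k >= 0` unknown at compile time; the predicate `k == 0 or k >= n` was derived by hand for this illustration, and all function names are invented. Python threads are used only to show the control structure, not to demonstrate speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def safe_to_parallelize(n: int, k: int) -> bool:
    """Hypothetical run-time test for the loop a[i + k] = a[i] + 1.

    With k == 0 each iteration touches only its own element; with k >= n
    the written range [k, k + n) cannot overlap the read range [0, n).
    Assumes k >= 0.
    """
    return k == 0 or k >= n


def run_sequential(a: list, n: int, k: int) -> None:
    # Original loop order: always correct, used when the test fails.
    for i in range(n):
        a[i + k] = a[i] + 1


def run_parallel(a: list, n: int, k: int, workers: int = 4) -> None:
    # Same loop body, split into contiguous chunks of iterations.
    def chunk(lo: int, hi: int) -> None:
        for i in range(lo, hi):
            a[i + k] = a[i] + 1

    step = max(1, -(-n // workers))  # ceiling division, at least 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(chunk, lo, min(lo + step, n))
                   for lo in range(0, n, step)]
        for f in futures:
            f.result()  # propagate any exception from a worker


def run_loop(a: list, n: int, k: int) -> str:
    """Pick the loop version at run time; return which one ran."""
    if safe_to_parallelize(n, k):
        run_parallel(a, n, k)
        return "parallel"
    run_sequential(a, n, k)
    return "sequential"
```

In this sketch the test costs two integer comparisons, which is the sense in which a run-time test derived from a predicate can be "low-cost" and "directed": it checks only the condition that static analysis could not resolve, rather than inspecting every array access.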

This work has been supported by DARPA Contract DABT63-95-C-0118, a fellowship from AT&T Bell Laboratories, the Air Force Materiel Command and DARPA contract F30602-95-C-0098.



Copyright information

© 1998 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Moon, S., So, B., Hall, M.W., Murphy, B. (1998). A Case for Combining Compile-Time and Run-Time Parallelization. In: O’Hallaron, D.R. (eds) Languages, Compilers, and Run-Time Systems for Scalable Computers. LCR 1998. Lecture Notes in Computer Science, vol 1511. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-49530-4_7

  • DOI: https://doi.org/10.1007/3-540-49530-4_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-65172-7

  • Online ISBN: 978-3-540-49530-7

  • eBook Packages: Springer Book Archive
