Optimal reordering and mapping of a class of nested-loops for parallel execution

  • Parallelizing Compilers
  • Conference paper
Languages and Compilers for Parallel Computing (LCPC 1996)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 1239))


This paper addresses the compile-time optimization of a class of nested-loop computations that arise in some computational physics applications. The computations involve summations over products of array terms in order to compute multi-dimensional surface and volume integrals. Reordering additions and multiplications and applying the distributive law can significantly reduce the number of operations required in evaluating these summations. In a multiprocessor environment, proper distribution of the arrays among processors will reduce the inter-processor communication time. We present a formal description of the operation minimization problem, a proof of its NP-completeness, and a pruning strategy for finding the optimal solution in small cases. We also give an algorithm for determining the optimal distribution of the arrays among processors in a multiprocessor environment.
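As a small illustration of how the distributive law can cut the operation count of such a summation, the sketch below evaluates a toy double sum, the sum over i and j of a[i] * b[j], in two ways and counts arithmetic operations. This is not the paper's algorithm or its class of expressions; the function names `naive_sum` and `factored_sum` and the input arrays are assumptions made only for this example.

```python
from itertools import product


def naive_sum(a, b):
    """Evaluate sum_i sum_j a[i] * b[j] term by term, counting operations."""
    total, ops = 0, 0
    for x, y in product(a, b):
        total += x * y  # one multiplication and one addition per (i, j) pair
        ops += 2
    return total, ops


def factored_sum(a, b):
    """Evaluate the same sum as (sum_i a[i]) * (sum_j b[j])."""
    sum_a, sum_b, ops = 0, 0, 0
    for x in a:
        sum_a += x  # one addition per element of a
        ops += 1
    for y in b:
        sum_b += y  # one addition per element of b
        ops += 1
    return sum_a * sum_b, ops + 1  # plus a single final multiplication


# Hypothetical integer inputs, chosen so both orderings agree exactly.
a = [1, 2, 3, 4]
b = [5, 6, 7]
naive_value, naive_ops = naive_sum(a, b)
fact_value, fact_ops = factored_sum(a, b)
print(naive_value, naive_ops)
print(fact_value, fact_ops)
```

For arrays of lengths N and M, the direct form costs on the order of N·M operations while the factored form costs on the order of N + M. With floating-point data the two orderings may differ slightly in rounding, which is why the example uses integers.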

Supported in part by NSF grant DMR-9520319.


Editor information

David Sehr, Utpal Banerjee, David Gelernter, Alex Nicolau, David Padua

Copyright information

© 1997 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lam, CC., Sadayappan, P., Wenger, R. (1997). Optimal reordering and mapping of a class of nested-loops for parallel execution. In: Sehr, D., Banerjee, U., Gelernter, D., Nicolau, A., Padua, D. (eds) Languages and Compilers for Parallel Computing. LCPC 1996. Lecture Notes in Computer Science, vol 1239. Springer, Berlin, Heidelberg.

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-63091-3

  • Online ISBN: 978-3-540-69128-0

  • eBook Packages: Springer Book Archive
