Beyond unimodular transformations

The Journal of Supercomputing

Abstract

This paper presents an approach to modeling loop transformations using linear algebra. Compound transformations are modeled as integer matrices. The nonsingular linear transformations presented here subsume the class of unimodular transformations. The loop transformations included are the unimodular transformations (reversal, skewing, and permutation) and a new transformation, namely stretching. Nonunimodular transformations (those whose determinant has absolute value greater than 1) create “holes” in the transformed iteration space, rendering code generation difficult. We solve this problem by suitably changing the step size of loops in order to “skip” these holes when traversing the transformed iteration space. For the class of nonunimodular loop transformations, we present algorithms for deriving the loop bounds, the array access expressions, and the step sizes of loops in the nest. To derive the step sizes, we compute the Hermite normal form of the transformation matrix; the step sizes are the entries on the diagonal of this matrix. We then use the theory of Hessenberg matrices in the derivation of exact loop bounds for nonunimodular transformations. We illustrate the use of this approach in several problems such as the generation of tile sets and distributed-memory code generation. This approach provides a framework for optimizing programs for a variety of architectures.
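The step-size derivation described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it assumes iteration vectors are column vectors, that the matrix is square, integer, and nonsingular, and that the Hermite normal form is taken in its lower-triangular, column-operation style. The function names (`hermite_normal_form`, `step_sizes`, `image_points`) and the example matrix are hypothetical and chosen only for illustration.

```python
from itertools import product


def hermite_normal_form(matrix):
    """Lower-triangular Hermite normal form of a nonsingular square integer
    matrix, obtained with unimodular column operations only (assumption:
    column-style HNF with a positive diagonal)."""
    h = [row[:] for row in matrix]
    n = len(h)
    for i in range(n):
        # Euclid's algorithm on columns i and j clears row i right of the diagonal.
        for j in range(i + 1, n):
            while h[i][j] != 0:
                q = h[i][i] // h[i][j]
                for r in range(n):
                    h[r][i], h[r][j] = h[r][j], h[r][i] - q * h[r][j]
        # Make the diagonal entry positive by negating the column if needed.
        if h[i][i] < 0:
            for r in range(n):
                h[r][i] = -h[r][i]
        # Reduce the entries left of the diagonal into the range [0, h[i][i]).
        for j in range(i):
            q = h[i][j] // h[i][i]
            for r in range(n):
                h[r][j] -= q * h[r][i]
    return h


def step_sizes(matrix):
    """Diagonal of the Hermite normal form, read as per-loop step sizes."""
    return [row[i] for i, row in enumerate(hermite_normal_form(matrix))]


def image_points(matrix, extent):
    """Image of the square 0 <= i, j < extent under a 2x2 transformation."""
    (a, b), (c, d) = matrix
    return {(a * i + b * j, c * i + d * j)
            for i, j in product(range(extent), repeat=2)}


# Hypothetical nonunimodular example: (i, j) -> (i + j, i - j), determinant -2.
t = [[1, 1], [1, -1]]
print(hermite_normal_form(t))  # [[1, 0], [1, 2]]
print(step_sizes(t))           # [1, 2]
```

For this example the outer loop keeps a step of 1 and the inner loop takes a step of 2, which matches the holes in the image: every transformed point (u, v) has u + v even, so for a fixed u only every second v is visited. The starting offset of the inner loop depends on u and is not derived in this sketch.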

Additional information

Supported in part by NSF Young Investigator Award CCR-9457768, by NSF grant CCR-9210422, and by the Louisiana Board of Regents through contract LEQSF (1991–94)-RD-A-09.

About this article

Cite this article

Ramanujam, J. Beyond unimodular transformations. J Supercomput 9, 365–389 (1995). https://doi.org/10.1007/BF01206273

  • DOI: https://doi.org/10.1007/BF01206273
