Determining transformation sequences for loop parallelization
Considerable research on loop parallelization for shared-memory multiprocessors has focused on developing transformations that remove loop-carried dependences. Many loops require more than one such transformation, so both the choice of transformations and the order in which they are applied are critical.
In this paper, we present an algorithm for selecting a sequence of transformations which, applied to a given loop, will yield an equivalent maximally parallel loop.
The algorithms presented in this paper have been implemented and tested in PAT, a tool for interactive parallelization of Fortran programs.
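As an illustration of why a loop may need more than one transformation, and why their order matters, consider the following minimal sketch. It is not taken from the paper and does not show its selection algorithm; the loop body, the function names `original` and `transformed`, and the choice of scalar expansion followed by loop distribution are assumptions made for the example, written in Python rather than Fortran for brevity.

```python
def original(b, a0):
    """Sequential loop with two kinds of loop-carried dependences."""
    n = len(b)
    a = [a0] + [0] * (n - 1)
    d = [0] * n
    for i in range(1, n):
        t = b[i] * 2          # scalar t is rewritten every iteration (anti/output dependences)
        a[i] = t + 1
        d[i] = a[i - 1] + t   # reads a value written by the previous iteration (flow dependence)
    return a, d


def transformed(b, a0):
    """Same computation after scalar expansion, then loop distribution."""
    n = len(b)
    a = [a0] + [0] * (n - 1)
    d = [0] * n
    t = [0] * n               # step 1: scalar expansion gives each iteration its own t[i]
    # Step 2: loop distribution. Within each loop below, no iteration
    # reads a value written by another iteration of the same loop.
    for i in range(1, n):
        t[i] = b[i] * 2
        a[i] = t[i] + 1
    for i in range(1, n):
        d[i] = a[i - 1] + t[i]
    return a, d
```

Distributing the loop without first expanding `t` would leave the second loop reading only the final value of the scalar, so in this sketch the expansion has to come first.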