A Loop Partitioning Method by Implementation of Gaussian Elimination
Extracting parallelism from nested loops with unimodular matrix transformations offers efficiency, flexibility, and ease of implementation in an auto-restructuring compiler. In such a transformation, the partitioning matrix determines both the number of independent parallel sets and the complexity of the transformation. In this paper, we propose a fast method for deriving the partitioning matrix required for loop parallelization. The method derives the partitioning matrix from the data dependence distance matrix by a modified Gaussian elimination. Examples show that the computational time complexity is close to 1/2 × (n × m) in most cases, while the number of partitions remains at the optimal values obtained in previous work. Finally, we emphasize that the method is more efficient for deeply nested loops, especially for loop depth n ≥ 4.
Keywords: Unimodular matrix transformation; Data dependence distance matrix; Gauss elimination
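To make the idea of reducing a distance matrix by elimination concrete, the following is a minimal sketch of a generic integer row reduction that uses only unimodular row operations (row swap, negation, adding an integer multiple of one row to another). It is not the modified Gaussian elimination proposed in the paper, whose details are not given in the abstract. The function names `integer_echelon` and `partition_count` are hypothetical, and the sketch assumes that each row of the input is one dependence distance vector. For a full-rank result, the product of the diagonal entries is the determinant of the lattice spanned by the distance vectors, which the unimodular partitioning literature associates with the number of independent partitions.

```python
def integer_echelon(rows):
    """Reduce an integer matrix to row echelon form using only unimodular
    row operations, so the integer lattice spanned by the rows is preserved.
    Assumption: each input row is one dependence distance vector."""
    a = [list(r) for r in rows]
    m = len(a)
    n = len(a[0]) if a else 0
    top = 0  # index of the next pivot row
    for col in range(n):
        if top == m:
            break
        # Euclid's algorithm on column `col`, restricted to rows top..m-1.
        while True:
            nonzero = [r for r in range(top, m) if a[r][col] != 0]
            if not nonzero:
                break
            # Move the entry of smallest magnitude into the pivot position.
            p = min(nonzero, key=lambda r: abs(a[r][col]))
            a[top], a[p] = a[p], a[top]
            # Subtract integer multiples of the pivot row from the rows below.
            for r in range(top + 1, m):
                q = a[r][col] // a[top][col]
                a[r] = [x - q * y for x, y in zip(a[r], a[top])]
            if all(a[r][col] == 0 for r in range(top + 1, m)):
                break
        if a[top][col] != 0:
            if a[top][col] < 0:
                a[top] = [-x for x in a[top]]  # keep pivots positive
            top += 1
    return a[:top]  # the nonzero rows only


def partition_count(echelon, depth):
    """Product of the diagonal for a full-rank echelon form, else None.
    Assumption: this product is read as the number of independent partitions."""
    if len(echelon) < depth:
        return None  # rank-deficient: the diagonal product is not defined here
    count = 1
    for i, row in enumerate(echelon):
        count *= row[i]
    return count


if __name__ == "__main__":
    distance_vectors = [[2, 2], [4, 1]]  # illustrative data, not from the paper
    reduced = integer_echelon(distance_vectors)
    print(reduced, partition_count(reduced, depth=2))
```

The pivot is chosen as the entry of smallest magnitude so that every elimination step strictly shrinks the remaining entries in the column, which guarantees termination without leaving the integers.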