A Loop Partitioning Method by Implementation of Gaussian Elimination

Conference paper
Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 293)

Abstract

Extracting parallelism from nested loops with unimodular matrix transformation offers efficiency, flexibility, and ease of coding when implementing an auto-restructuring compiler. In a unimodular matrix transformation, the partitioning matrix determines both the number of independent parallel sets and the complexity of the transformation. In this paper, we propose a fast method for deriving the partitioning matrix required for loop parallelization. The method derives the partitioning matrix quickly and easily from the data dependence distance matrix using a modified Gauss elimination. Examples show that the computational time complexity is close to 1/2 × (n × m) in most cases, while the number of partitions remains at the optimal values reported in previous work. Finally, we emphasize that the method is more efficient for deeper nested loops, especially for loop depth n ≥ 4.
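To make the idea concrete, the following is a minimal sketch of how a triangular matrix can be obtained from a dependence distance matrix with integer-only elimination, and how the number of independent partitions can be read from its diagonal. It is not the paper's modified Gauss elimination, whose details are not given in the abstract. The function names `integer_echelon` and `partition_count` are hypothetical, each row is assumed to hold one distance vector, and loop bounds are ignored.

```python
def integer_echelon(rows):
    """Reduce an integer matrix to echelon form using only unimodular row
    operations (row swaps, adding an integer multiple of one row to another).
    Hypothetical helper for illustration; not the paper's algorithm."""
    m = [list(r) for r in rows]
    n_cols = len(m[0]) if m else 0
    pivot_row = 0
    for col in range(n_cols):
        while True:
            nonzero = [r for r in range(pivot_row, len(m)) if m[r][col] != 0]
            if len(nonzero) <= 1:
                break
            # Euclid-style step: reduce every other row by the row whose
            # entry in this column has the smallest absolute value.
            p = min(nonzero, key=lambda r: abs(m[r][col]))
            for r in nonzero:
                if r != p:
                    q = m[r][col] // m[p][col]
                    m[r] = [a - q * b for a, b in zip(m[r], m[p])]
        nonzero = [r for r in range(pivot_row, len(m)) if m[r][col] != 0]
        if nonzero:
            p = nonzero[0]
            m[pivot_row], m[p] = m[p], m[pivot_row]
            pivot_row += 1
    # Drop rows that were reduced to all zeros.
    return [r for r in m if any(r)]


def partition_count(distance_vectors, depth):
    """Number of independent iteration sets for a loop nest of the given depth,
    assuming the distance vectors span all `depth` dimensions.
    Returns None when they do not (the count then depends on loop bounds)."""
    ech = integer_echelon(distance_vectors)
    if len(ech) < depth:
        return None
    count = 1
    for i in range(depth):
        count *= abs(ech[i][i])
    return count


if __name__ == "__main__":
    # Two distance vectors in a doubly nested loop.
    print(partition_count([[1, 1], [1, -1]], depth=2))
    # Three distance vectors, one of them redundant.
    print(partition_count([[2, 4], [4, 2], [6, 6]], depth=2))
```

The sketch relies on the fact that two iterations can depend on each other only if their difference is an integer combination of the distance vectors, so when the vectors have full rank the number of independent sets equals the absolute determinant of the reduced triangular matrix.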

Keywords

Unimodular matrix transformation; Data dependence distance matrix; Gauss elimination


Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. Department of Electrical Engineering, Cheng Shiu University, Kaohsiung, Taiwan, Republic of China
