On the optimality of Allen and Kennedy's algorithm for parallelism extraction in nested loops

  • Alain Darte
  • Frédéric Vivien
Workshop 03 Automatic Parallelization and High Performance Compilers
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1123)


We explore the link between dependence abstractions and maximal parallelism extraction in nested loops. Our goal is to find, for each dependence abstraction, the minimal transformations needed to extract maximal parallelism. Our main result is that Allen and Kennedy's algorithm is optimal when dependences are approximated by dependence levels: even the most sophisticated algorithm cannot detect more parallelism than Allen and Kennedy's algorithm does, as long as dependence levels are the only information available.
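To make the role of dependence levels concrete, the following is a minimal sketch of a level-based decomposition in the spirit of Allen and Kennedy's algorithm. It is an illustration under stated assumptions, not the authors' formulation: the function names (`sccs`, `allen_kennedy`, `loop_marks`), the edge encoding `(source, sink, level)`, and the use of infinity for loop-independent dependences are all hypothetical choices, and every statement is assumed to sit inside the same `depth` loops. The sketch only labels each loop around each statement as sequential or parallel; it does not perform loop distribution or generate code.

```python
import math

INF = math.inf  # assumed encoding for a loop-independent dependence


def sccs(nodes, edges):
    """Strongly connected components (Tarjan's algorithm)."""
    succ = {n: [] for n in nodes}
    for u, v, _ in edges:
        succ[u].append(v)
    index, low, on_stack, stack, out = {}, {}, set(), [], []

    def visit(u):
        index[u] = low[u] = len(index)
        stack.append(u)
        on_stack.add(u)
        for v in succ[u]:
            if v not in index:
                visit(v)
                low[u] = min(low[u], low[v])
            elif v in on_stack:
                low[u] = min(low[u], index[v])
        if low[u] == index[u]:  # u is the root of a component
            comp = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.add(w)
                if w == u:
                    break
            out.append(comp)

    for n in nodes:
        if n not in index:
            visit(n)
    return out


def allen_kennedy(nodes, edges, depth, k=1, marks=None):
    """Mark loops k..depth of each statement as 'seq' or 'par'."""
    if marks is None:
        marks = {n: {} for n in nodes}
    # Only dependences of level >= k matter once loops 1..k-1 are fixed.
    edges = [e for e in edges
             if e[2] >= k and e[0] in nodes and e[1] in nodes]
    for comp in sccs(nodes, edges):
        inner = [e for e in edges if e[0] in comp and e[1] in comp]
        lmin = min((e[2] for e in inner), default=INF)
        # First loop that must stay sequential; depth + 1 means none.
        seq = lmin if lmin <= depth else depth + 1
        for s in comp:
            for lvl in range(k, seq):
                marks[s][lvl] = "par"
            if seq <= depth:
                marks[s][seq] = "seq"
        if seq <= depth:
            # Recurse on the component with the deeper dependences only.
            allen_kennedy(comp, inner, depth, seq + 1, marks)
    return marks


def loop_marks(nodes, edges, depth):
    """Return, per statement, the marks of loops 1..depth in order."""
    marks = allen_kennedy(nodes, edges, depth)
    return {s: [marks[s][l] for l in range(1, depth + 1)] for s in nodes}


# Hypothetical 2-deep nest: S1 depends on itself at level 1 and
# feeds S2 through a loop-independent dependence.
example = loop_marks(["S1", "S2"],
                     [("S1", "S1", 1), ("S1", "S2", INF)],
                     depth=2)
# example == {"S1": ["seq", "par"], "S2": ["par", "par"]}
```

In this example the level-1 self-dependence forces the outer loop around S1 to run sequentially, while S2, which lies on no dependence cycle, keeps both of its loops parallel. The only input the procedure consults is the level attached to each edge, which is the setting in which the abstract's optimality claim is stated.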


Keywords: Clock Cycle, Dependence Graph, Nest Loop, Dependence Level, Parallel Loop



Copyright information

© Springer-Verlag Berlin Heidelberg 1996

Authors and Affiliations

  • Alain Darte (1)
  • Frédéric Vivien (1)
  1. URA CNRS 1398, ENS-Lyon, LIP, Lyon Cedex 07, France
