Forward Communication Only Placements and Their Use for Parallel Program Construction

  • Martin Griebl
  • Paul Feautrier
  • Armin Größlinger
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2481)

Abstract

The context of this paper is automatic parallelization by the space-time mapping method. One key issue in that approach is adjusting the granularity of the derived parallelism. For that purpose, we use tiling in the space and time dimensions. While space tiling is always legal, time tiling is subject to legality constraints, unless the placement is such that communications always go in the same direction (forward communications only, FCO). We derive an algorithm that automatically constructs an FCO placement, if one exists. We show that the method is applicable to many familiar kernels and that it gives satisfactory speedups.
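As a rough illustration of the forward-communications-only condition, the sketch below checks whether a given linear placement sends every dependence in a non-negative direction along each processor dimension. It is not the construction algorithm from the paper. It assumes uniform (constant-distance) dependence vectors and a purely linear placement matrix, and the names `apply_placement` and `is_fco` are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[int]
Matrix = Sequence[Sequence[int]]


def apply_placement(placement: Matrix, v: Vector) -> List[int]:
    """Multiply the placement matrix by a vector (one entry per processor dimension)."""
    return [sum(a * x for a, x in zip(row, v)) for row in placement]


def is_fco(placement: Matrix, dependences: Sequence[Vector]) -> bool:
    """Assumed FCO test: every dependence distance maps to a vector whose
    components are all >= 0, so no communication goes backward in space."""
    return all(
        component >= 0
        for d in dependences
        for component in apply_placement(placement, d)
    )


# Hypothetical 1-D stencil in (t, i): iteration (t, i) reads from
# (t-1, i-1), (t-1, i) and (t-1, i+1), giving these distance vectors.
stencil_deps = [(1, 1), (1, 0), (1, -1)]

# Placing iteration (t, i) on processor i sends data both left and right.
place_by_i = [[0, 1]]
# Placing iteration (t, i) on processor t + i skews the space axis.
place_skewed = [[1, 1]]

print(is_fco(place_by_i, stencil_deps))
print(is_fco(place_skewed, stencil_deps))
```

Under these assumptions, the placement by `i` alone is rejected because one dependence maps to -1, while the skewed placement `t + i` maps the three dependences to 2, 1 and 0, so all communication goes forward.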

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Martin Griebl (1)
  • Paul Feautrier (2)
  • Armin Größlinger (1)
  1. FMI, University of Passau, Germany
  2. Unité de Recherche de Rocquencourt, INRIA, France
