Evaluation of automatic parallelization strategies for HPF compilers
In the data-parallel programming style, the user usually specifies the parallelism explicitly, so the compiler can generate efficient code without advanced analysis techniques.
In some situations, however, specifying the parallelism explicitly is impossible or inconvenient. This is especially true for loop nests with data dependences along distributed dimensions.
For uniform loop nests, scheduling, mapping, and partitioning techniques are available. Several such strategies have been considered and evaluated with existing High Performance Fortran compilation systems. This paper reports on the performance and benefits of the different techniques and optimizations; the results are intended to guide the future development of data parallel compilers.
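To illustrate the problem the abstract describes, the following is a minimal, hypothetical HPF sketch (not taken from the paper) of a loop nest with a data dependence along a distributed dimension. The array name and distribution are assumptions for illustration only.

```fortran
      REAL A(100, 100)
!HPF$ DISTRIBUTE A(BLOCK, *)
!     The first dimension of A is block-distributed across processors.
      DO J = 2, 100
         DO I = 2, 100
!           A(I-1,J) induces a dependence along the distributed
!           dimension I: iteration I needs a value that may live on a
!           neighboring processor, so the compiler cannot simply run
!           the iterations independently. Scheduling, mapping, and
!           partitioning techniques (e.g. pipelining the I loop) are
!           needed to extract parallelism here.
            A(I, J) = A(I-1, J) + A(I, J-1)
         END DO
      END DO
```

Because the dependence crosses processor boundaries, an HPF compiler must insert communication and order the iterations accordingly; this is the situation in which explicit data-parallel annotations alone are insufficient.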