Slicing analysis and indirect accesses to distributed arrays
An increasing fraction of the applications targeted by parallel computers make heavy use of indirection arrays to index their data arrays. Such irregular access patterns make it difficult for a compiler to generate efficient parallel code. Previously developed techniques addressing this problem are limited in that they are applicable only to a single level of indirection. However, many codes using sparse data structures access their data through multiple levels of indirection.
This paper presents a method for transforming programs using multiple levels of indirection into programs with at most one level of indirection, thereby broadening the range of applications that a compiler can parallelize efficiently. A central concept of our algorithm is to perform program slicing on the subscript expressions of the indirect array accesses. Such slices peel off the levels of indirection, one by one, and create opportunities for aggregated data prefetching in between. A slice graph eliminates redundant preprocessing and gives an ordering in which to compute the slices. We present our work in the context of High Performance Fortran; an implementation in a Fortran D prototype compiler is in progress.
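The transformation the abstract describes can be sketched as follows. This is an illustrative Python sketch, not the paper's actual algorithm or notation: a doubly indirect access `a(outer(inner(i)))` is peeled, level by level, into loops that each contain at most one level of indirection, so the gathered trace from each phase can be aggregated (e.g., fetched as a block) before the next phase runs. All names (`access_peeled`, `outer`, `inner`) are hypothetical.

```python
def access_direct(a, outer, inner):
    # Original pattern: two levels of indirection per element.
    return [a[outer[inner[i]]] for i in range(len(inner))]

def access_peeled(a, outer, inner):
    # Phase 1: slice on the innermost subscript -- one indirection level.
    t1 = [inner[i] for i in range(len(inner))]   # gather the inner trace
    # Phase 2: the gathered trace is now a plain subscript array,
    # so this gather again has only one level of indirection.
    t2 = [outer[j] for j in t1]
    # Phase 3: the final data access uses a single level of indirection.
    return [a[k] for k in t2]

a = [10, 20, 30, 40]
outer = [3, 2, 1, 0]
inner = [0, 2, 1, 3]
assert access_direct(a, outer, inner) == access_peeled(a, outer, inner)
```

In a distributed-memory setting, each phase boundary is where runtime support could prefetch the entire trace array in one aggregated communication step, which is the opportunity the slicing exposes.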
Keywords: Control Flow Graph, Local Array, Runtime Support, Trace Array, Irregular Problem