Communication Generation for Cyclic(K) Distributions
Communication resulting from references to arrays with general cyclic(k) distributions in data-parallel programs is not amenable to existing analyses developed for block and cyclic distributions. The methods for communication generation presented in this paper are based on exploiting the repetitive nature of array accesses. We represent array access patterns as periodic sequences and use these sequences for efficient communication analysis and code generation. Our approach allows us to incorporate message coalescing optimization and to use overlap areas for shift communication. Experimental results from our prototype implementation demonstrate the validity of the proposed techniques.
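As a rough illustration of the repetitive access structure the abstract refers to, the sketch below computes which processor owns each element of a strided reference under a cyclic(k) distribution. It is not the paper's algorithm. The function names, the 0-based indexing, and the layout convention (blocks of k consecutive elements dealt round-robin to p processors) are assumptions made for this example.

```python
from math import gcd

def owner(i, k, p):
    """Processor owning global index i (0-based) under cyclic(k) on p processors."""
    return (i // k) % p

def local_address(i, k, p):
    """Offset of global index i within its owner's local storage (assumed layout)."""
    return (i // (k * p)) * k + i % k

def owner_period(stride, k, p):
    """Number of accesses after which the owner sequence is guaranteed to repeat.

    The owner depends only on i mod (k*p), and (lower + j*stride) mod (k*p)
    repeats every (k*p) / gcd(stride, k*p) steps in j.
    """
    return (k * p) // gcd(stride, k * p)

def owner_pattern(lower, stride, k, p):
    """One period of owners for the accesses lower, lower+stride, lower+2*stride, ..."""
    n = owner_period(stride, k, p)
    return [owner(lower + j * stride, k, p) for j in range(n)]

if __name__ == "__main__":
    # Hypothetical parameters: block size 4, 3 processors, stride 5.
    print(owner_pattern(lower=0, stride=5, k=4, p=3))
```

Because the pattern repeats, a compiler only needs to analyze one period to know which elements each processor must send or receive; later periods follow by adding a fixed offset.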
Keywords: Finite State Machine, Array Element, Virtual View, Shift Communication, Interprocessor Communication