Automatic data decomposition for message-passing machines
The data distribution problem is complex because it involves a trade-off between minimizing communication and maximizing parallelism. A common approach to solving this problem is to break the data mapping into two stages: an alignment stage and a distribution stage. The alignment stage attempts to increase parallelism, while the distribution stage attempts to decrease communication overhead. In contrast to previous approaches, we consider the alignment and distribution problems in a unified framework and attempt to simultaneously maximize parallelism and minimize communication overhead. The problem becomes harder when dynamic remapping, multi-dimensional distributions, array replication, and control flow are taken into account. This paper formulates the full data decomposition problem, addressing all of these issues, and presents a simple new algorithm that finds the optimal solution of the dynamic data distribution problem, given the number of processors and a partitioning of the input program into phases. The algorithm runs efficiently for small search spaces (several hundred data distributions).
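To make the dynamic data distribution problem concrete, the following Python sketch chooses one distribution per phase so that total execution cost plus remapping cost is minimal. It is an illustration under simplifying assumptions, not the algorithm presented in the paper: phases are assumed to execute in a straight-line sequence (no control flow), and the tables `exec_cost` and `remap_cost`, as well as the function name, are hypothetical inputs that a compiler's cost model would have to supply.

```python
from typing import List, Tuple


def best_dynamic_distribution(
    exec_cost: List[List[float]],
    remap_cost: List[List[float]],
) -> Tuple[float, List[int]]:
    """Pick one candidate distribution per phase at minimum total cost.

    exec_cost[p][d]  : assumed cost of running phase p under distribution d
    remap_cost[e][d] : assumed cost of remapping data from distribution e to d
    Phases are assumed to run in order 0, 1, ..., with no branching.
    """
    n_phases = len(exec_cost)
    n_dists = len(remap_cost)

    # best[d] = cheapest cost of finishing the current phase in distribution d
    best = list(exec_cost[0])
    back: List[List[int]] = []  # back[p-1][d] = predecessor chosen for phase p

    for p in range(1, n_phases):
        new_best = []
        choices = []
        for d in range(n_dists):
            # Cheapest distribution to come from, including the remap cost.
            prev = min(range(n_dists), key=lambda e: best[e] + remap_cost[e][d])
            new_best.append(best[prev] + remap_cost[prev][d] + exec_cost[p][d])
            choices.append(prev)
        best = new_best
        back.append(choices)

    # Recover the per-phase choices by walking the predecessors backwards.
    last = min(range(n_dists), key=lambda d: best[d])
    total = best[last]
    plan = [last]
    for choices in reversed(back):
        plan.append(choices[plan[-1]])
    plan.reverse()
    return total, plan


if __name__ == "__main__":
    # Two candidate distributions, three phases; remapping costs 3 each way.
    exec_cost = [[1, 10], [10, 1], [1, 10]]
    remap_cost = [[0, 3], [3, 0]]
    print(best_dynamic_distribution(exec_cost, remap_cost))
```

With P phases and D candidate distributions this sketch takes O(P · D²) time, which is manageable for a few hundred candidates. Handling control flow or replication would require a richer formulation than this straight-line model.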