The Hierarchical Factor Algorithm for All-to-All Communication
We present an algorithm for regular, personalized all-to-all communication, in which every processor has an individual message to deliver to every other processor. Our machine model is a cluster of processing nodes where each node, possibly consisting of several processors, can participate in only one communication operation with another node at a time. The nodes may have different numbers of processors. This general model is important for the implementation of all-to-all communication in libraries such as MPI, where collective communication may take place over arbitrary subsets of processors. The algorithm is optimal up to an additive term that is small if the total number of processors is large compared to the maximal number of processors in a node.
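To make the pairwise round structure concrete, the sketch below implements the classical one-factorization ("circle method") schedule on which factor algorithms are built: in each round the nodes are matched into disjoint pairs, and over n-1 rounds (for even n; n rounds for odd n, with one node idle per round) every pair of nodes communicates exactly once. This is a minimal illustration of the underlying factorization only, not the paper's hierarchical variant for nodes with differing processor counts; the function name `partner` and the demo driver are our own.

```c
#include <stdio.h>

/* One-factorization of the complete graph K_n by the circle method.
 * Returns the partner of node `me` in round `r`, or -1 if `me` is
 * idle (odd n only).  Even n: rounds 0..n-2; odd n: rounds 0..n-1.
 * Within a round the pairs are disjoint; over all rounds every pair
 * of nodes meets exactly once. */
int partner(int me, int r, int n)
{
    if (n % 2 != 0) {
        /* odd n: the node i with 2*i == r (mod n) sits out this round */
        int p = ((r - me) % n + n) % n;
        return (p == me) ? -1 : p;
    }
    /* even n: pair nodes 0..n-2 by the odd rule; the node that would
     * be idle pairs with node n-1 instead */
    int m = n - 1;
    if (me == m) {
        /* node n-1 meets the unique i with 2*i == r (mod m);
         * m is odd, so (m+1)/2 is the inverse of 2 modulo m */
        return (int)(((long)r * ((m + 1) / 2)) % m);
    }
    int p = ((r - me) % m + m) % m;
    return (p == me) ? m : p;
}

int main(void)
{
    int n = 6;                       /* hypothetical node count */
    for (int r = 0; r < n - 1; r++) {
        printf("round %d:", r);
        for (int me = 0; me < n; me++) {
            int p = partner(me, r, n);
            if (me < p) printf("  (%d,%d)", me, p);
        }
        printf("\n");
    }
    return 0;
}
```

The hierarchical algorithm refines such a schedule for nodes with unequal processor counts, but the basic round structure of disjoint pairwise exchanges carries over.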