The Hierarchical Factor Algorithm for All-to-All Communication

  • Peter Sanders
  • Jesper Larsson Träff
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2400)


We present an algorithm for regular, personalized all-to-all communication, in which every processor has an individual message to deliver to every other processor. Our machine model is a cluster of processing nodes, where each node, possibly comprising several processors, can participate in only one communication operation with another node at a time. Nodes may have different numbers of processors. This general model is important for implementing all-to-all communication in libraries such as MPI, where collective communication may take place over arbitrary subsets of processors. The algorithm is optimal up to an additive term that is small when the total number of processors is large compared to the maximum number of processors in a node.
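The node-level scheduling idea behind factor algorithms is a 1-factorization of the complete graph: with an even number of nodes p, the p(p-1)/2 node pairs can be partitioned into p-1 rounds of perfect matchings, so every node is busy with exactly one partner per round. The sketch below shows this classic pairing rule only; it is background for the paper's approach, not the authors' hierarchical variant, and the function name `partner` is ours.

```python
def partner(i, r, p):
    """Partner of node i in round r of a 1-factorization of K_p.

    Assumes p is even and r ranges over 0 .. p-2. Nodes 0 .. p-2 are
    paired by the rule i + j = r (mod p-1); the node i with 2i = r
    (mod p-1) is instead matched with the fixed node p-1.
    """
    assert p % 2 == 0 and 0 <= r < p - 1
    if i == p - 1:
        # Invert 2 mod (p-1): since 2*(p/2) = p = 1 (mod p-1),
        # the inverse of 2 is p//2, so node p-1 meets i = r/2 mod (p-1).
        return (r * (p // 2)) % (p - 1)
    j = (r - i) % (p - 1)
    return p - 1 if j == i else j

# Example: the three rounds for p = 4 nodes.
for r in range(3):
    print(r, [partner(i, r, 4) for i in range(4)])
```

In each round the pairing is an involution (if i meets j, then j meets i), and over all p-1 rounds every pair of nodes meets exactly once, which is what lets each node exchange its personalized messages with every other node without contention under the one-communication-per-node model.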







Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  1. Peter Sanders, Max-Planck-Institut für Informatik, Saarbrücken, Germany
  2. Jesper Larsson Träff, C&C Research Laboratories, NEC Europe Ltd., Sankt Augustin, Germany
