Abstract
Recent studies show that MPI processes in real applications can arrive at an MPI collective operation at different times. Such imbalanced process arrival patterns can significantly degrade the performance of the collective. MPI_Alltoall() and MPI_Allgather() are communication-intensive collective operations used in many scientific applications, so their efficient implementation under different process arrival patterns is critical to the performance of scientific applications on modern clusters. In this paper, we propose novel RDMA-based, process-arrival-pattern-aware MPI_Alltoall() and MPI_Allgather() algorithms for different message sizes on InfiniBand clusters. We also extend the algorithms to be shared-memory aware for small to medium messages under process arrival patterns. The performance results indicate that the proposed algorithms outperform the native MVAPICH implementations as well as other non-arrival-pattern-aware algorithms when processes arrive at different times. Specifically, the RDMA-based process-arrival-pattern-aware MPI_Alltoall() and MPI_Allgather() are 3.1 times faster than MVAPICH for 8 KB messages. On average, the applications studied in this paper (FT, RADIX, and N-BODY) achieve a speedup of 1.44 with the proposed algorithms.
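To make the two collectives concrete, the following is a minimal pure-Python sketch of their *semantics* only: in MPI_Alltoall() every rank sends a distinct block to every other rank (a block transpose across ranks), while in MPI_Allgather() every rank's single contribution ends up replicated on all ranks. The function names `alltoall` and `allgather` are illustrative; this sketch does not model the paper's RDMA, shared-memory, or arrival-pattern-aware mechanisms.

```python
# Illustrative sketch of MPI_Alltoall/MPI_Allgather data movement,
# with ranks represented as list indices. Not an MPI implementation.

def alltoall(send):
    """All-to-all personalized exchange (MPI_Alltoall semantics).

    send[i][j] is the block rank i sends to rank j. The result recv
    satisfies recv[i][j] == send[j][i]: rank i ends up holding the
    block destined for it from every rank j."""
    nprocs = len(send)
    return [[send[j][i] for j in range(nprocs)] for i in range(nprocs)]


def allgather(contrib):
    """All-gather (MPI_Allgather semantics).

    contrib[i] is rank i's single block; afterwards every rank holds
    the concatenation of all blocks in rank order."""
    gathered = list(contrib)
    return [gathered[:] for _ in contrib]  # one identical copy per rank
```

For example, with two ranks, `alltoall([["a0", "a1"], ["b0", "b1"]])` yields `[["a0", "b0"], ["a1", "b1"]]`: rank 0 receives block 0 from both ranks. The communication volume of both collectives grows with the number of ranks, which is why a single late-arriving process can stall the whole exchange.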
Cite this article
Qian, Y., Afsahi, A. Process Arrival Pattern Aware Alltoall and Allgather on InfiniBand Clusters. Int J Parallel Prog 39, 473–493 (2011). https://doi.org/10.1007/s10766-010-0152-3