
Process Arrival Pattern Aware Alltoall and Allgather on InfiniBand Clusters

Published in: International Journal of Parallel Programming

Abstract

Recent studies show that MPI processes in real applications can arrive at an MPI collective operation at different times. This imbalanced process arrival pattern can significantly affect the performance of the collective operation. MPI_Alltoall() and MPI_Allgather() are communication-intensive collective operations used in many scientific applications, so their efficient implementation under different process arrival patterns is critical to the performance of scientific applications running on modern clusters. In this paper, we propose novel RDMA-based, process arrival pattern aware MPI_Alltoall() and MPI_Allgather() algorithms for different message sizes over InfiniBand clusters. We also extend the algorithms to be shared memory aware for small- to medium-sized messages under process arrival patterns. The performance results indicate that the proposed algorithms outperform the native MVAPICH implementations as well as other non-process arrival pattern aware algorithms when processes arrive at different times. Specifically, the RDMA-based process arrival pattern aware MPI_Alltoall() and MPI_Allgather() are 3.1 times faster than MVAPICH for 8 KB messages. On average, the applications studied in this paper (FT, RADIX, and N-BODY) achieve a speedup of 1.44 with the proposed algorithms.
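The core issue, ranks reaching a collective at different times, can be illustrated with a short MPI program. The sketch below (plain C over standard MPI calls, independent of the paper's RDMA-based and shared-memory-aware designs) delays each rank by a rank-dependent amount after a barrier and then times MPI_Alltoall; the per-rank delay is an arbitrary illustrative value, while the 8 KB per-destination message echoes the size highlighted in the abstract.

    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* 8 KB per destination, the message size highlighted in the abstract */
        const int count = 8192 / sizeof(int);
        int *sendbuf = malloc((size_t)size * count * sizeof(int));
        int *recvbuf = malloc((size_t)size * count * sizeof(int));
        for (int i = 0; i < size * count; i++)
            sendbuf[i] = rank;

        /* Start from a synchronized point, then skew each rank's arrival
           at the collective with a rank-dependent delay (arbitrary values). */
        MPI_Barrier(MPI_COMM_WORLD);
        usleep((useconds_t)rank * 1000);

        double t0 = MPI_Wtime();
        MPI_Alltoall(sendbuf, count, MPI_INT,
                     recvbuf, count, MPI_INT, MPI_COMM_WORLD);
        double elapsed = MPI_Wtime() - t0;

        printf("rank %d: MPI_Alltoall took %.6f s\n", rank, elapsed);

        free(sendbuf);
        free(recvbuf);
        MPI_Finalize();
        return 0;
    }

Compiled with mpicc and launched with mpirun, early-arriving ranks in a non-arrival-pattern-aware alltoall typically end up waiting for the late arrivals, so the skew introduced by the delays shows up in their measured collective time; this is the effect that arrival pattern aware algorithms aim to mitigate.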



Author information


Corresponding author

Correspondence to Ahmad Afsahi.



Cite this article

Qian, Y., Afsahi, A. Process Arrival Pattern Aware Alltoall and Allgather on InfiniBand Clusters. Int J Parallel Prog 39, 473–493 (2011). https://doi.org/10.1007/s10766-010-0152-3
