Improving the Performance of Collective Operations in MPICH

Thakur, Rajeev; Gropp, William D.

doi:10.1007/978-3-540-39924-7_38

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2840)

Included in the following conference series:

European Parallel Virtual Machine / Message Passing Interface Users’ Group Meeting

Abstract

We report on our work on improving the performance of collective operations in MPICH on clusters connected by switched networks. For each collective operation, we use multiple algorithms depending on the message size, with the goal of minimizing latency for short messages and minimizing bandwidth usage for long messages. Although we have implemented new algorithms for all MPI collective operations, because of limited space we describe only the algorithms for allgather, broadcast, reduce-scatter, and reduce. We present performance results using the SKaMPI benchmark on a Myrinet-connected Linux cluster and an IBM SP. In all cases, the new algorithms significantly outperform the old algorithms used in MPICH on the Myrinet cluster, and, in many cases, they outperform the algorithms used in IBM’s MPI on the SP.
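To illustrate the idea of switching algorithms by message size, the following minimal sketch compares two textbook broadcast algorithms under a simple latency-bandwidth cost model. Everything here is an assumption made for illustration and is not taken from the abstract: the cost model (`alpha` as per-message latency in seconds, `beta` as transfer time per byte), the choice of a binomial tree versus a scatter followed by an allgather, the function names, and the numeric parameter values. It is not MPICH's actual selection logic.

```python
import math


def binomial_tree_cost(p, n, alpha, beta):
    """Estimated time to broadcast n bytes to p processes with a binomial tree.

    Assumed model: ceil(log2 p) steps, and the full message is sent in each step.
    """
    steps = math.ceil(math.log2(p))
    return steps * (alpha + n * beta)


def scatter_allgather_cost(p, n, alpha, beta):
    """Estimated time for a broadcast done as a scatter followed by an allgather.

    Assumed model: more messages (higher latency term), but each process
    moves only about 2 * n * (p - 1) / p bytes in total.
    """
    steps = math.ceil(math.log2(p)) + (p - 1)
    return steps * alpha + 2.0 * (p - 1) / p * n * beta


def choose_broadcast(p, n, alpha, beta):
    """Return the name of the algorithm with the lower estimated cost."""
    costs = {
        "binomial_tree": binomial_tree_cost(p, n, alpha, beta),
        "scatter_allgather": scatter_allgather_cost(p, n, alpha, beta),
    }
    return min(costs, key=costs.get)


if __name__ == "__main__":
    # Hypothetical machine parameters: 10 microseconds latency, 1 ns per byte.
    ALPHA, BETA = 1e-5, 1e-9
    for nbytes in (8, 1024, 1 << 20):
        print(nbytes, choose_broadcast(16, nbytes, ALPHA, BETA))
```

With these assumed parameters, the latency term dominates for short messages, so the algorithm with fewer communication steps is preferred; for long messages the bandwidth term dominates, so the algorithm that moves fewer bytes per process is preferred. In practice, the switch points would be determined by measurement on the target system rather than by fixed constants.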

Author information

Authors and Affiliations

Argonne National Laboratory, Mathematics and Computer Science Division, 9700 S. Cass Avenue, Argonne, IL, 60439, USA
Rajeev Thakur & William D. Gropp

Editor information

Editors and Affiliations

Computer Science Department, University of Tennessee, 37996-3450, Knoxville, TN, USA
Jack Dongarra
Information Science and Technologies Institute (ISTI), The Italian National Research Council (CNR), Area della Ricerca, Via Giuseppe Moruzzi, 1, I-56126, Pisa, Italy
Domenico Laforenza
Department of Computer Science, Ca’ Foscari University of Venice, Italy
Salvatore Orlando

Cite this paper

Thakur, R., Gropp, W.D. (2003). Improving the Performance of Collective Operations in MPICH. In: Dongarra, J., Laforenza, D., Orlando, S. (eds) Recent Advances in Parallel Virtual Machine and Message Passing Interface. EuroPVM/MPI 2003. Lecture Notes in Computer Science, vol 2840. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39924-7_38

DOI: https://doi.org/10.1007/978-3-540-39924-7_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20149-6
Online ISBN: 978-3-540-39924-7
eBook Packages: Springer Book Archive
