Abstract
Existing algorithms for creating communicators in MPI programs will not scale to future exascale supercomputers containing millions of cores. In this work, we present a novel communicator-creation algorithm that scales well to millions of processes using three techniques: replacing the sort at the end of MPI_Comm_split with merging as the color-and-key table is built, sorting the color-and-key table in parallel, and storing the output communicator data in a distributed table rather than a replicated one. These changes reduce the time cost of MPI_Comm_split in the worst case we consider from 22 seconds to 0.37 seconds. Existing algorithms build a table with as many entries as there are processes, consuming vast amounts of memory. Our algorithm uses a small, fixed amount of memory per communicator after MPI_Comm_split has finished, and only a fraction of the memory used by the conventional algorithm for temporary storage while MPI_Comm_split executes.
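The first technique above, merging entries into a sorted color-and-key table as they arrive instead of sorting the whole table at the end, can be illustrated with a minimal sketch. This is a Python illustration of the idea under MPI_Comm_split semantics (processes with the same color form a new communicator, ordered by key with ties broken by old rank), not the authors' implementation; the 6-process entry list is hypothetical.

```python
import bisect

def build_table(entries):
    """Maintain a (color, key, rank) table in sorted order as entries
    arrive, rather than sorting the full table once at the end (the
    conventional approach the paper replaces)."""
    table = []
    for color, key, rank in entries:
        # Insert each triple at its sorted position. Comparing tuples
        # orders first by color, then key, then original rank, which
        # matches MPI_Comm_split's tie-breaking rule.
        bisect.insort(table, (color, key, rank))
    return table

# Hypothetical 6-process example: two colors, arbitrary keys.
entries = [(1, 5, 0), (0, 2, 1), (1, 1, 2), (0, 9, 3), (1, 5, 4), (0, 2, 5)]
table = build_table(entries)

# Old ranks belonging to the color-0 communicator, in new-rank order:
color0 = [rank for color, key, rank in table if color == 0]
print(color0)  # → [1, 5, 3]
```

Because each insertion keeps the table sorted, the final sort step disappears; the paper combines this with parallel sorting and a distributed table so that no process ever holds the full table.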
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
Cite this paper
Sack, P., Gropp, W. (2010). A Scalable MPI_Comm_split Algorithm for Exascale Computing. In: Keller, R., Gabriel, E., Resch, M., Dongarra, J. (eds) Recent Advances in the Message Passing Interface. EuroMPI 2010. Lecture Notes in Computer Science, vol 6305. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15646-5_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15645-8
Online ISBN: 978-3-642-15646-5