
Mitigating MPI Message Matching Misery

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9697)

Abstract

To satisfy MPI ordering semantics in the presence of wildcards, current implementations store posted receive operations and unexpected messages in linked lists. As applications scale up, communication patterns that scale with the number of processes or the number of threads per process can cause those linked lists to grow and become a performance problem. We propose new structures and matching algorithms to address these performance challenges. Our scheme utilizes a hash map that is extended with message ordering annotations to significantly reduce time spent searching for matches in the posted receive and the unexpected message structures. At the same time, we maintain the required MPI ordering semantics, even in the presence of wildcards. We evaluate our approach on several benchmarks and demonstrate a significant reduction in the number of unsuccessful match attempts in the MPI message processing engine, while at the same time incurring low space and time overheads.
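To make the idea concrete, below is a minimal C sketch of the posted-receive side of such a scheme. It is not the authors' implementation: the structure names, the fixed bucket count, and the single global sequence counter are assumptions made for illustration. Exact (source, tag) receives are placed in hash buckets, wildcard receives go to a separate list, and every entry carries a sequence number recording post order. On arrival, a message is compared against the oldest exact candidate and the oldest compatible wildcard candidate, and whichever was posted first wins, which preserves MPI ordering semantics even in the presence of wildcards.

    /* Hypothetical sketch of hash-based posted-receive matching with
     * ordering annotations. Names, sizes, and layout are illustrative,
     * not taken from the paper. */
    #include <stdint.h>
    #include <stdlib.h>

    #define NUM_BUCKETS 256
    #define ANY_SOURCE  (-1)
    #define ANY_TAG     (-1)

    typedef struct recv_entry {
        int source, tag;
        uint64_t seq;                  /* ordering annotation: post order */
        void *buffer;
        struct recv_entry *next;
    } recv_entry;

    static recv_entry *buckets[NUM_BUCKETS]; /* exact (source, tag) receives */
    static recv_entry *wildcards;            /* receives using any wildcard  */
    static uint64_t next_seq;                /* monotonically increasing     */

    static unsigned hash(int source, int tag) {
        return ((unsigned)source * 31u + (unsigned)tag) % NUM_BUCKETS;
    }

    /* Post a receive: wildcard receives go to a separate list so the hash
     * lookup stays exact; every entry records the order in which it was
     * posted. */
    void post_recv(int source, int tag, void *buffer) {
        recv_entry *e = malloc(sizeof *e);
        e->source = source; e->tag = tag; e->buffer = buffer;
        e->seq = next_seq++;
        recv_entry **head = (source == ANY_SOURCE || tag == ANY_TAG)
                            ? &wildcards : &buckets[hash(source, tag)];
        while (*head) head = &(*head)->next;  /* append to keep FIFO order */
        e->next = NULL;
        *head = e;
    }

    /* Match an incoming message: find the oldest exact-bucket candidate and
     * the oldest matching wildcard candidate, then take whichever was posted
     * first. Returns NULL if no receive matches (unexpected message path). */
    recv_entry *match_recv(int source, int tag) {
        recv_entry **exact = &buckets[hash(source, tag)];
        while (*exact && ((*exact)->source != source || (*exact)->tag != tag))
            exact = &(*exact)->next;

        recv_entry **wild = &wildcards;
        while (*wild &&
               !(((*wild)->source == ANY_SOURCE || (*wild)->source == source) &&
                 ((*wild)->tag    == ANY_TAG    || (*wild)->tag    == tag)))
            wild = &(*wild)->next;

        recv_entry **winner = NULL;
        if (*exact && (!*wild || (*exact)->seq < (*wild)->seq)) winner = exact;
        else if (*wild)                                         winner = wild;
        if (!winner) return NULL;

        recv_entry *e = *winner;
        *winner = e->next;              /* unlink the matched entry */
        return e;
    }

The unexpected-message structure can be organized with a symmetric layout, hashing arrived messages by their source and tag; that side is omitted here to keep the sketch short.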



Author information


Correspondence to Mario Flajslik or James Dinan.



Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Flajslik, M., Dinan, J., Underwood, K.D. (2016). Mitigating MPI Message Matching Misery. In: Kunkel, J., Balaji, P., Dongarra, J. (eds) High Performance Computing. ISC High Performance 2016. Lecture Notes in Computer Science, vol. 9697. Springer, Cham. https://doi.org/10.1007/978-3-319-41321-1_15


  • DOI: https://doi.org/10.1007/978-3-319-41321-1_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-41320-4

  • Online ISBN: 978-3-319-41321-1

  • eBook Packages: Computer Science, Computer Science (R0)
