Finepoints: Partitioned Multithreaded MPI Communication

Grant, Ryan E.; Dosanjh, Matthew G. F.; Levenhagen, Michael J.; Brightwell, Ron; Skjellum, Anthony

doi:10.1007/978-3-030-20656-7_17


Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11501)

Included in the following conference series:

International Conference on High Performance Computing


Abstract

The MPI multithreading model has historically been difficult to optimize; the interface that it provides for threads was designed as a process-level interface. This model has led to implementations that treat function calls as critical regions and protect them with locks to avoid race conditions. We hypothesize that an interface designed specifically for threads can provide better performance than current approaches and even outperform single-threaded MPI.

In this paper, we describe a design for partitioned communication in MPI that we call finepoints. First, we assess the existing communication models for MPI two-sided communication and then introduce finepoints as a hybrid of MPI models that has the best features of each existing MPI communication model. In addition, “partitioned communication” created with finepoints leverages new network hardware features that cannot be exploited with current MPI point-to-point semantics, making this new approach innovative and useful both now and in the future.

To demonstrate the validity of our hypothesis, we implement a finepoints library and show improvements against a state-of-the-art multithreaded optimized Open MPI implementation on a Cray XC40 with an Aries network. Our experiments demonstrate up to a 12× reduction in wait time for completion of send operations. This new model is shown working on a nuclear reactor physics neutron-transport proxy-application, providing up to 26.1% improvement in communication time and up to 4.8% improvement in runtime over the best performing MPI communication mode, single-threaded MPI.
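
One way to picture the partitioned communication idea is a single send buffer split into partitions, where each thread fills its own partition and marks it ready independently, and each partition can be moved as soon as it is ready rather than after every thread has finished. The sketch below is a toy, single-process Python model of that idea only. It is not the finepoints library or any MPI interface, it makes no performance claim, and every name in it (`PartitionedSend`, `fill_and_mark_ready`, `drain`, `run_example`) is hypothetical.

```python
import queue
import threading


class PartitionedSend:
    """Toy model of one send buffer split into equally sized partitions.

    Hypothetical illustration only; not the finepoints or MPI API.
    """

    def __init__(self, num_partitions, partition_size):
        self.num_partitions = num_partitions
        self.partition_size = partition_size
        self.buffer = [0] * (num_partitions * partition_size)
        # Thread-safe queue of partition indices that are ready to transfer.
        self._ready = queue.Queue()

    def fill_and_mark_ready(self, index, value):
        """Called by one worker thread: write its partition, then publish it."""
        start = index * self.partition_size
        for i in range(start, start + self.partition_size):
            self.buffer[i] = value
        self._ready.put(index)

    def drain(self, destination):
        """Copy each partition to the destination as soon as it is ready."""
        order = []
        for _ in range(self.num_partitions):
            index = self._ready.get()  # blocks until some partition is ready
            start = index * self.partition_size
            end = start + self.partition_size
            destination[start:end] = self.buffer[start:end]
            order.append(index)
        return order


def run_example(num_threads=4, partition_size=3):
    """Run one worker thread per partition and transfer partitions as they arrive."""
    send = PartitionedSend(num_threads, partition_size)
    destination = [None] * (num_threads * partition_size)
    workers = [
        threading.Thread(target=send.fill_and_mark_ready, args=(t, t + 1))
        for t in range(num_threads)
    ]
    for worker in workers:
        worker.start()
    order = send.drain(destination)
    for worker in workers:
        worker.join()
    return destination, order
```

Calling `run_example()` returns the fully assembled destination buffer together with the order in which partitions were transferred. That order depends on thread scheduling, which is the point of the model: threads contribute to one message without waiting on each other.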

Sandia National Laboratories is a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia LLC, a wholly owned subsidiary of Honeywell International Inc. for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-NA0003525. This research was supported by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. Department of Energy Office of Science and the National Nuclear Security Administration.



Author information

Authors and Affiliations

Sandia National Laboratories, Albuquerque, USA
Ryan E. Grant, Matthew G. F. Dosanjh, Michael J. Levenhagen & Ron Brightwell
University of Tennessee at Chattanooga, Chattanooga, USA
Anthony Skjellum


Corresponding author

Correspondence to Ryan E. Grant.

Editor information

Editors and Affiliations

University of Edinburgh, Edinburgh, UK
Michèle Weiland
Helmholtz-Zentrum Dresden-Rossendorf (HZDR), Dresden, Germany
Guido Juckeland
Technical University of Munich, Munich, Germany
Carsten Trinitis
Ohio State University, Columbus, USA
Ponnuswamy Sadayappan


About this paper

Cite this paper

Grant, R.E., Dosanjh, M.G.F., Levenhagen, M.J., Brightwell, R., Skjellum, A. (2019). Finepoints: Partitioned Multithreaded MPI Communication. In: Weiland, M., Juckeland, G., Trinitis, C., Sadayappan, P. (eds) High Performance Computing. ISC High Performance 2019. Lecture Notes in Computer Science, vol 11501. Springer, Cham. https://doi.org/10.1007/978-3-030-20656-7_17


DOI: https://doi.org/10.1007/978-3-030-20656-7_17
Published: 17 May 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-20655-0
Online ISBN: 978-3-030-20656-7
eBook Packages: Computer Science, Computer Science (R0)
