
MPI + MPI: a new hybrid approach to parallel programming with MPI plus shared memory

Abstract

Hybrid parallel programming that uses the Message Passing Interface (MPI) for internode communication together with a shared-memory programming model for intranode parallelism has become a dominant approach to scalable parallel programming. While this model offers considerable flexibility and performance potential, it burdens programmers with the complexity of using two parallel programming systems in the same application. We introduce an MPI-integrated shared-memory programming model that is incorporated into MPI through a small extension to the one-sided communication interface. We discuss the integration of this interface with the MPI 3.0 one-sided semantics and describe solutions for providing portable and efficient data sharing, atomic operations, and memory consistency. We describe an implementation of the new interface in MPICH2 and Open MPI and demonstrate an average performance improvement of 40 % in the communication component of a five-point stencil solver.
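
The one-sided extension discussed in the abstract corresponds to the shared-memory window support that entered MPI 3.0 (MPI_Comm_split_type with MPI_COMM_TYPE_SHARED, MPI_Win_allocate_shared, and MPI_Win_shared_query). The following minimal sketch is for illustration only and is not taken from the article; it assumes any MPI 3.0-compliant library (such as MPICH2 or Open MPI) and shows how ranks on one node can allocate a shared window, locate a neighbor's segment, and use MPI synchronization for memory consistency before reading it with ordinary loads.

/* Illustrative sketch only: create an MPI 3.0 shared-memory window
   among the ranks of one node and read a neighbor's data with
   ordinary loads after MPI synchronization. All calls are standard
   MPI 3.0; error handling is omitted for brevity. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    /* Communicator containing only the ranks that can share memory,
       i.e., the ranks on this node. */
    MPI_Comm node_comm;
    MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                        MPI_INFO_NULL, &node_comm);

    int node_rank, node_size;
    MPI_Comm_rank(node_comm, &node_rank);
    MPI_Comm_size(node_comm, &node_size);

    /* Each rank contributes 1024 doubles to a window backed by
       node-wide shared memory. */
    const MPI_Aint count = 1024;
    double *my_base;
    MPI_Win win;
    MPI_Win_allocate_shared(count * sizeof(double), sizeof(double),
                            MPI_INFO_NULL, node_comm, &my_base, &win);

    /* Locate the left neighbor's segment; with the default contiguous
       allocation it is directly load/store accessible. */
    int left = (node_rank + node_size - 1) % node_size;
    MPI_Aint left_size;
    int left_disp_unit;
    double *left_base;
    MPI_Win_shared_query(win, left, &left_size, &left_disp_unit, &left_base);

    /* Passive-target epoch: write my own segment, then use the
       sync-barrier-sync pattern before reading the neighbor's memory. */
    MPI_Win_lock_all(MPI_MODE_NOCHECK, win);
    for (MPI_Aint i = 0; i < count; i++)
        my_base[i] = (double)node_rank;
    MPI_Win_sync(win);          /* make my stores visible */
    MPI_Barrier(node_comm);     /* synchronize the node's ranks */
    MPI_Win_sync(win);          /* observe the other ranks' stores */
    double halo = left_base[0]; /* direct shared-memory load */
    MPI_Win_unlock_all(win);

    printf("rank %d read %.0f from rank %d\n", node_rank, halo, left);

    MPI_Win_free(&win);
    MPI_Comm_free(&node_comm);
    MPI_Finalize();
    return 0;
}

Combining such a per-node shared-memory window with an ordinary MPI communicator for internode exchanges lets a single MPI library cover both levels of the hierarchy, which is the "MPI + MPI" approach of the title.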

Acknowledgments

We thank the members of the MPI Forum and the MPI community for their efforts in creating the MPI 3.0 specification. In addition, we thank Jeff R. Hammond for reviewing a draft of this article. This work was supported by the U.S. Department of Energy, Office of Science, Advanced Scientific Computing Research, under contract DE-AC02-06CH11357; under award number DE-FC02-10ER26011, with program manager Sonia Sachs; under award number DE-FG02-08ER25835; and as part of the Extreme-scale Algorithms and Software Institute (EASI), U.S. DOE award DE-SC0004131. Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000.

Corresponding author

Correspondence to Torsten Hoefler.

Cite this article

Hoefler, T., Dinan, J., Buntinas, D. et al. MPI + MPI: a new hybrid approach to parallel programming with MPI plus shared memory. Computing 95, 1121–1136 (2013). https://doi.org/10.1007/s00607-013-0324-2
