MPC: A Unified Parallel Runtime for Clusters of NUMA Machines

Pérache, Marc; Jourdren, Hervé; Namyst, Raymond

doi:10.1007/978-3-540-85451-7_9

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5168)

Abstract

Over the last decade, the Message Passing Interface (MPI) has become a very successful parallel programming environment for distributed-memory architectures such as clusters. However, the architecture of cluster nodes is currently evolving from small symmetric shared-memory multiprocessors towards massively multicore, Non-Uniform Memory Access (NUMA) hardware. Although regular MPI implementations use numerous optimizations to achieve zero-copy, cache-oblivious data transfers within shared-memory nodes, they may prevent applications from exploiting most of the hardware's performance, simply because the scheduling of heavyweight processes is not flexible enough to adapt dynamically to the underlying hardware topology. This explains why several research efforts have investigated hybrid approaches that mix message passing between nodes with memory sharing inside nodes, such as MPI+OpenMP solutions [1,2]. However, these approaches require considerable programming effort to adapt or rewrite existing MPI applications.

In this paper, we present the MultiProcessor Communications environment (MPC), which aims at providing programmers with an efficient runtime system for their existing MPI, POSIX Thread or hybrid MPI+Thread applications. The key idea is to use user-level threads instead of processes on multiprocessor cluster nodes in order to increase scheduling flexibility, better control memory allocations, and optimize the scheduling of communication flows with other nodes. Most existing MPI applications can run over MPC with no modification. We obtained substantial gains (up to 20%) by using MPC instead of a regular MPI runtime on several scientific applications.
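As a rough illustration of the key idea, the sketch below runs MPI-like tasks as threads inside a single process, so that a message between two tasks on the same node is handed over by reference rather than copied between address spaces. It is not MPC's API: the names `ThreadedWorld`, `send`, `recv`, `run`, `ring_task` and `run_ring` are hypothetical, and Python's `threading` module provides kernel-level threads rather than the user-level threads MPC relies on. Only the shared address space is being illustrated here.

```python
import queue
import threading


class ThreadedWorld:
    """Toy communicator whose ranks are threads in one process (hypothetical API)."""

    def __init__(self, size):
        self.size = size
        # One mailbox per (source, destination) pair keeps messages ordered per sender.
        self._mailboxes = {
            (src, dst): queue.Queue() for src in range(size) for dst in range(size)
        }

    def send(self, src, dst, payload):
        # Only a reference is enqueued: no serialization, no copy between tasks.
        self._mailboxes[(src, dst)].put(payload)

    def recv(self, src, dst):
        # Blocks until the matching sender has posted a message.
        return self._mailboxes[(src, dst)].get()

    def run(self, task):
        """Run task(world, rank) once per rank; return the results indexed by rank."""
        results = [None] * self.size

        def wrapper(rank):
            results[rank] = task(self, rank)

        threads = [
            threading.Thread(target=wrapper, args=(rank,)) for rank in range(self.size)
        ]
        for thread in threads:
            thread.start()
        for thread in threads:
            thread.join()
        return results


def ring_task(world, rank):
    """Send a buffer to the right neighbour and receive one from the left neighbour."""
    right = (rank + 1) % world.size
    left = (rank - 1) % world.size
    buffer = [rank] * 4
    world.send(rank, right, buffer)
    received = world.recv(left, rank)
    return received, buffer


def run_ring(size=4):
    """Convenience wrapper: run the ring exchange on `size` thread-based ranks."""
    return ThreadedWorld(size).run(ring_task)
```

Calling `run_ring(4)` returns one `(received, sent)` pair per rank. The list received by rank 1 is the very same object that rank 0 sent, which is the in-node zero-copy behaviour that a shared address space makes possible. A real runtime would additionally have to handle scheduling, memory placement on NUMA nodes, and communication with other nodes, none of which is modelled here.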

References

[1] Cappello, F., Etiemble, D.: MPI versus MPI+OpenMP on the IBM SP for the NAS benchmarks. SuperComputing (2000)
[2] Smith, L., Bull, M.: Development of mixed mode MPI/OpenMP applications. Scientific Programming (2001)
[3] Van der Steen, A.: Overview of recent supercomputers (2006)
[4] Liu, J., Chandrasekaran, B., Jiang, J., Kini, S., Yu, W., Buntinas, D., Wyckoff, P., Panda, D.: Performance comparison of MPI implementations over InfiniBand, Myrinet and Quadrics (2003)
[5] Hoeflinger, J.: Extending OpenMP* to clusters (2006)
[6] Lee, J., Sato, M., Boku, T.: Design and implementation of OpenMPD: An OpenMP-like programming language for distributed memory systems. In: Chapman, B.M., Zheng, W., Gao, G.R., Sato, M., Ayguadé, E., Wang, D. (eds.) IWOMP 2007. LNCS, vol. 4935. Springer, Heidelberg (2008)
[7] Smith, L., Kent, P.: Development and performances of a mixed OpenMP/MPI quantum Monte Carlo code. Concurrency: Practice and Experience (2000)
[8] Kalé, L.: The virtualization model of parallel programming: runtime optimizations and the state of art. In: LACSI (2002)
[9] Huang, C., Lawlor, O., Kalé, L.V.: Adaptive MPI. In: Proceedings of the 16th International Workshop on Languages and Compilers for Parallel Computing (2003)
[10] Demaine, E.: A Threads-Only MPI implementation for the development of parallel programming. In: Proceedings of the 11th International Symposium on High Performance Computing Systems (1997)
[11] Pérache, M.: Contribution à l’élaboration d’environnements de programmation dédiés au calcul scientifique hautes performances. PhD thesis, Bordeaux 1 University (2006)
[12] Namyst, R.: PM2: un environnement pour une conception portable et une exécution efficace des applications parallèles irrégulières. PhD thesis, Lille 1 University (1997)
[13] Abt, B., Desai, S., Howell, D., Perez-Gonzalez, I., McCracken, D.: Next Generation POSIX Threading Project (2002), http://www-124.ibm.com/developerworks/oss/pthread
[14] Berger, E., McKinley, K., Blumofe, R., Wilson, P.: Hoard: a scalable memory allocator for multithreaded applications. In: International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-IX) (2000)
[15] Torrellas, J., Lam, M.S., Hennessy, J.L.: False sharing and spatial locality in multiprocessor caches. IEEE Transactions on Computers (1994)
[16] Berger, E., Zorn, B., McKinley, K.: Composing high-performance memory allocators. In: Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (2001)
[17] Del Pino, S., Despres, B., Have, P., Jourdren, H., Piserchia, P.F.: 3D finite volume simulation of acoustic waves in the earth atmosphere. Computers & Fluids (submitted)
[18] Jourdren, H.: HERA: a hydrodynamic AMR platform for multi-physics simulations. In: Adaptive mesh refinement - theory and applications, LNCSE (2005)

Author information

Authors and Affiliations

CEA/DAM Île de France Bruyères-le-Châtel, F-91297, Arpajon Cedex
Marc Pérache & Hervé Jourdren
Laboratoire Bordelais de Recherche en Informatique 351, cours de la Libération, F-33405, Talence cedex
Raymond Namyst

Editor information

Emilio Luque, Tomàs Margalef, Domingo Benítez

About this paper

Cite this paper

Pérache, M., Jourdren, H., Namyst, R. (2008). MPC: A Unified Parallel Runtime for Clusters of NUMA Machines. In: Luque, E., Margalef, T., Benítez, D. (eds) Euro-Par 2008 – Parallel Processing. Euro-Par 2008. Lecture Notes in Computer Science, vol 5168. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85451-7_9

DOI: https://doi.org/10.1007/978-3-540-85451-7_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85450-0
Online ISBN: 978-3-540-85451-7
eBook Packages: Computer Science, Computer Science (R0)
