Implementation and Shared-Memory Evaluation of MPICH2 over the Nemesis Communication Subsystem

  • Darius Buntinas
  • Guillaume Mercier
  • William Gropp
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4192)


This paper presents the implementation of MPICH2 over the Nemesis communication subsystem and the evaluation of its shared-memory performance. We describe design issues as well as some of the optimization techniques we employed. We conducted a performance evaluation over shared memory using microbenchmarks as well as application benchmarks. The evaluation shows that MPICH2 Nemesis has very low communication overhead, making it suitable for smaller-grained applications.


Shared Memory; Message Passing Interface; Instruction Count; Communication Subsystem; Progress Engine
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Darius Buntinas (1)
  • Guillaume Mercier (1)
  • William Gropp (1)
  1. Mathematics and Computer Science Division, Argonne National Laboratory
