Performance Evaluation of Mixed-Mode OpenMP/MPI Implementations

  • J. Mark Bull
  • James Enright
  • Xu Guo
  • Chris Maynard
  • Fiona Reid


With the current prevalence of multi-core processors in HPC architectures, mixed-mode programming, using both MPI and OpenMP in the same application, is seen as an important technique for achieving high levels of scalability. As there are few standard benchmarks written in this paradigm, it is difficult to assess the likely performance of such programs. To help address this, we examine the performance of mixed-mode OpenMP/MPI on a number of popular HPC architectures, using a synthetic benchmark suite and two large-scale applications. We find performance characteristics which differ significantly between implementations, and which highlight possible areas for improvement, especially when multiple OpenMP threads communicate simultaneously via MPI.


Keywords: Mixed-mode, MPI, OpenMP, Performance
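The performance issue singled out in the abstract is the case where several OpenMP threads make MPI calls at the same time, which requires the MPI library to provide MPI_THREAD_MULTIPLE support. The following is a minimal sketch of that kind of multi-threaded message exchange between pairs of ranks; it is not taken from the benchmark suite itself, and the buffer size, tag scheme, and rank pairing are illustrative assumptions.

```c
/* Sketch: every OpenMP thread on every rank exchanges its own MPI
 * message with the matching thread on a partner rank.  This requires
 * MPI_THREAD_MULTIPLE; sizes and tags are illustrative only. */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

#define NELEM 1024  /* assumed message length (doubles) */

int main(int argc, char **argv)
{
    int provided, rank, size;

    /* Request full thread support so several threads may call MPI
     * concurrently; bail out if the library does not provide it. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (provided < MPI_THREAD_MULTIPLE) {
        if (rank == 0)
            fprintf(stderr, "MPI_THREAD_MULTIPLE not available\n");
        MPI_Finalize();
        return 1;
    }

    /* Pair up ranks: each even rank exchanges with the next odd rank. */
    int partner = (rank % 2 == 0) ? rank + 1 : rank - 1;

    if (partner < size) {
        #pragma omp parallel
        {
            int tid = omp_get_thread_num();
            double sendbuf[NELEM], recvbuf[NELEM];

            for (int i = 0; i < NELEM; i++)
                sendbuf[i] = rank + tid;

            /* Each thread talks to the same-numbered thread on the
             * partner rank, using the thread id as the message tag. */
            MPI_Sendrecv(sendbuf, NELEM, MPI_DOUBLE, partner, tid,
                         recvbuf, NELEM, MPI_DOUBLE, partner, tid,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        }
    }

    MPI_Finalize();
    return 0;
}
```

Timing a kernel of this shape while varying the number of threads per rank exposes how well a given MPI implementation copes with concurrent calls from multiple threads, which is the behaviour identified above as a likely area for improvement.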





Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  • J. Mark Bull (1)
  • James Enright (1)
  • Xu Guo (1)
  • Chris Maynard (1)
  • Fiona Reid (1)

  1. EPCC, The University of Edinburgh, Edinburgh, UK
