Performance Evaluation of MPI, UPC and OpenMP on Multicore Architectures

  • Damián A. Mallón
  • Guillermo L. Taboada
  • Carlos Teijeiro
  • Juan Touriño
  • Basilio B. Fraguela
  • Andrés Gómez
  • Ramón Doallo
  • J. Carlos Mouriño
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5759)

Abstract

The current trend toward multicore architectures underscores the need for parallelism. While new languages and alternatives for supporting these systems more efficiently are being proposed, MPI faces this new challenge. Therefore, up-to-date performance evaluations of the current options for programming multicore systems are needed. This paper evaluates MPI performance against Unified Parallel C (UPC) and OpenMP on multicore architectures. From the analysis of the results, it can be concluded that MPI is generally the best choice on multicore systems with both shared and hybrid shared/distributed memory, as it takes the greatest advantage of data locality, the key factor for performance in these systems. UPC exploits the data layout in memory efficiently but suffers from remote shared memory accesses, whereas OpenMP usually lacks efficient data locality support and is restricted to shared memory systems, which limits its scalability.

Keywords

MPI, UPC, OpenMP, Multicore Architectures, Performance Evaluation, NAS Parallel Benchmarks (NPB)

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Damián A. Mallón (1)
  • Guillermo L. Taboada (2)
  • Carlos Teijeiro (2)
  • Juan Touriño (2)
  • Basilio B. Fraguela (2)
  • Andrés Gómez (1)
  • Ramón Doallo (2)
  • J. Carlos Mouriño (1)
  1. Galicia Supercomputing Center (CESGA), Santiago de Compostela, Spain
  2. Computer Architecture Group, University of A Coruña, A Coruña, Spain
