Benchmarking Parallel Performance on Many-Core Processors

  • Bryant C. Lam
  • Ajay Barboza
  • Ravi Agrawal
  • Alan D. George
  • Herman Lam
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8356)

Abstract

With the emergence of many-core processor architectures in HPC, concerns arise about the performance and productivity of existing parallel-programming tools, models, and languages. As these devices begin to augment conventional distributed cluster systems in an era of heterogeneous supercomputing, many-core processors must be evaluated and profiled in order to understand their performance and architectural strengths with existing parallel-programming environments and HPC applications. This paper compares the performance of two many-core processors, the Tilera TILE-Gx8036 and the Intel Xeon Phi 5110P, in terms of application performance with the SHMEM and OpenMP parallel-programming environments. Several applications written or provided in SHMEM and OpenMP are evaluated to analyze the scalability of existing tools and libraries on these many-core platforms. Our results show that SHMEM and OpenMP parallel applications scale well on the TILE-Gx and Xeon Phi, but that their performance depends heavily on optimized libraries and instrumentation.
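
As a rough illustration of what a scalability analysis of this kind typically reports, the sketch below computes speedup (baseline time divided by parallel time) and parallel efficiency (speedup divided by core count) from wall-clock timings. These are the standard textbook definitions, not necessarily the exact metrics used in the paper. The function name `scaling_table` is hypothetical, and the timings in the usage section are placeholders, not measurements from the TILE-Gx or Xeon Phi.

```python
from typing import Dict, List, Tuple


def scaling_table(timings: Dict[int, float]) -> List[Tuple[int, float, float]]:
    """Return (cores, speedup, efficiency) rows sorted by core count.

    `timings` maps a core count to wall-clock seconds and must include
    a 1-core baseline run. (Hypothetical helper, for illustration only.)
    """
    if 1 not in timings:
        raise ValueError("timings must include a 1-core baseline")
    baseline = timings[1]
    rows = []
    for cores in sorted(timings):
        speedup = baseline / timings[cores]   # S(p) = T(1) / T(p)
        efficiency = speedup / cores          # E(p) = S(p) / p
        rows.append((cores, speedup, efficiency))
    return rows


if __name__ == "__main__":
    # Placeholder timings in seconds; NOT results from the paper.
    example = {1: 64.0, 2: 32.0, 4: 20.0, 8: 16.0}
    for cores, speedup, efficiency in scaling_table(example):
        print(f"{cores:>3} cores  speedup {speedup:5.2f}  efficiency {efficiency:5.2f}")
```

An efficiency close to 1.0 at a given core count indicates near-linear scaling; a falling efficiency suggests overheads such as synchronization or memory contention, though attributing the cause requires profiling.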

Keywords

PGAS, SHMEM, OpenMP, many-core, parallel programming, performance analysis, high-performance computing, parallel architectures



Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Bryant C. Lam (1)
  • Ajay Barboza (1)
  • Ravi Agrawal (1)
  • Alan D. George (1)
  • Herman Lam (1)

  1. NSF Center for High-Performance Reconfigurable Computing (CHREC), Department of Electrical and Computer Engineering, University of Florida, Gainesville, USA
