Memory Performance and SPEC OpenMP Scalability on Quad-Socket x86_64 Systems

  • Daniel Molka
  • Robert Schöne
  • Daniel Hackenberg
  • Matthias S. Müller
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7016)


Because of the continuous trend towards higher core counts, parallelization is mandatory for many application domains beyond the traditional HPC sector. Current commodity servers comprise up to 48 processor cores in configurations with only four sockets. These shared-memory systems have distinct NUMA characteristics: the exact location of data within the memory system significantly affects both access latency and bandwidth. NUMA-aware memory allocation and scheduling are therefore highly relevant to performance. In this paper we use low-level microbenchmarks to compare two state-of-the-art quad-socket systems with x86_64 processors from AMD and Intel. We then investigate the performance of the application-based OpenMP benchmark suite SPEC OMPM2001. Our analysis shows how these benchmarks scale on shared-memory systems with up to 48 cores and how scalability correlates with the previously determined characteristics of the memory hierarchy. Furthermore, we demonstrate how the processor interconnects influence the benchmark results.


Keywords: Shared Memory, Memory Bandwidth, Access Latency, Memory Latency, Shared Memory System
These keywords were added by machine and not by the authors.




Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Daniel Molka (1)
  • Robert Schöne (1)
  • Daniel Hackenberg (1)
  • Matthias S. Müller (1)

  1. Center for Information Services and High Performance Computing (ZIH), Technische Universität Dresden, Dresden, Germany
