An Approach to Visualize Remote Socket Traffic on the Intel Nehalem-EX

  • Christian Iwainsky
  • Thomas Reichstein
  • Christopher Dahnken
  • Dieter an Mey
  • Christian Terboven
  • Andrey Semin
  • Christian Bischof
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6586)

Abstract

The integration of the memory controller on the processor die enables ever larger core counts in commodity shared-memory systems with Non-Uniform Memory Architecture (NUMA) properties. Shared-memory parallelization with OpenMP is an elegant and widely used approach to leverage the power of such systems. The binding of OpenMP threads to compute cores and the corresponding memory association are becoming ever more critical for obtaining optimal performance. In this work we provide a method to measure the number of remote-socket memory accesses a thread generates. We use the available CPU performance monitoring counters in combination with thread binding on a quad-socket Nehalem-EX system. To visualize the collected data we use Vampir.

Keywords

Cache Line; Remote Memory; Jacobi Iteration; OpenMP Thread; Remote Memory Access
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Christian Iwainsky (1)
  • Thomas Reichstein (1)
  • Christopher Dahnken (2)
  • Dieter an Mey (1)
  • Christian Terboven (1)
  • Andrey Semin (2)
  • Christian Bischof (1)

  1. Center for Computing and Communication, RWTH Aachen University, Germany
  2. Intel GmbH, Feldkirchen bei München, Germany
