Towards NUMA Support with Distance Information

  • Dirk Schmidl
  • Christian Terboven
  • Dieter an Mey
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6665)


Today most multi-socket shared memory systems exhibit a non-uniform memory architecture (NUMA). However, programming models such as OpenMP do not provide explicit support for it. To overcome this limitation, we propose a platform-independent approach to describing the system topology and placing threads on the hardware. A distance matrix provides system information and enables thread binding with user-defined strategies. We propose and implement means to query this information from within the program, so that expert users can take advantage of this knowledge, and we demonstrate the usefulness of our approach with an application from the Fraunhofer Institute for Laser Technology in Aachen.


Keywords: Shared Memory, Reduction Operation, Memory Bandwidth, Distance Information, System Topology





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Dirk Schmidl¹
  • Christian Terboven¹
  • Dieter an Mey¹

  1. Center for Computing and Communication, RWTH Aachen University, Germany
