Improved Scalability by Using Hardware-Aware Thread Affinities

Mallach, Sven; Gutwenger, Carsten

doi:10.1007/978-3-642-16233-6_6

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6310)

Abstract

The complexity of efficient thread management rises steadily with the number of processor cores and with the heterogeneity of system architectures, e.g., the topology of execution units and the memory architecture. In this paper, we show that combining information about the system topology with hardware-aware thread management is worthwhile. We present such a hardware-aware approach, which uses thread affinity to automatically steer the mapping of threads to cores, and analyze its performance experimentally. Our experiments show that it achieves significantly better scalability and runtime stability than the ordinary thread dispatching provided by the operating system.
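
To make the idea of topology-driven thread-to-core mapping concrete, the following is a minimal sketch and not the authors' implementation, which the abstract does not describe. It assumes a hypothetical topology given as a list of sockets, each listing its core IDs, and two illustrative placement strategies ("compact" and "scatter") that are common in affinity tooling but are not taken from the paper. The names `plan_affinity` and `pin_current_thread` are invented for this example. The pinning helper relies on `os.sched_setaffinity`, which is only available on some platforms such as Linux.

```python
import os


def plan_affinity(topology, n_threads, strategy="scatter"):
    """Return a list mapping thread index -> core id.

    topology: list of sockets, each a list of core ids
              (hypothetical layout supplied by the caller).
    strategy: "compact" fills one socket before using the next;
              "scatter" places consecutive threads on different sockets.
    Both strategies are illustrative assumptions, not the paper's method.
    """
    if not topology or not any(topology):
        raise ValueError("topology must contain at least one core")

    if strategy == "compact":
        # Socket by socket: neighbouring threads share a socket.
        order = [core for socket in topology for core in socket]
    elif strategy == "scatter":
        # Round-robin over sockets: neighbouring threads are spread out.
        order = []
        depth = max(len(socket) for socket in topology)
        for i in range(depth):
            for socket in topology:
                if i < len(socket):
                    order.append(socket[i])
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")

    # Wrap around if more threads than cores are requested.
    return [order[t % len(order)] for t in range(n_threads)]


def pin_current_thread(core):
    """Best-effort pinning of the calling thread to one core.

    Returns True if the affinity was applied, False if the platform
    lacks os.sched_setaffinity or the core is not currently allowed.
    On Linux, pid 0 refers to the calling thread.
    """
    if not hasattr(os, "sched_setaffinity"):
        return False
    if core not in os.sched_getaffinity(0):
        return False
    os.sched_setaffinity(0, {core})
    return True
```

A worker thread would call `pin_current_thread(plan[i])` at startup, where `plan = plan_affinity(topology, n_threads)`. Which strategy helps depends on the workload: placing threads close together may favor shared caches, while spreading them may favor aggregate memory bandwidth.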

Author information

Authors and Affiliations

Universität zu Köln, Germany
Sven Mallach
Technische Universität Dortmund, Germany
Carsten Gutwenger

Editor information

Editors and Affiliations

High Performance Computing Center Stuttgart (HLRS), Universität Stuttgart, Nobelstr. 19, 70569, Stuttgart, Germany
Rainer Keller
Institute of Computer Science and Engineering, Karlsruhe Institute of Technology, Haid-und-Neu-Str. 7, 76131, Karlsruhe, Germany
David Kramer
Engineering Mathematics and Computing Lab (EMCL) & Institute for Applied and Numerical Mathematics 4, Karlsruhe Institute of Technology, Fritz-Erler-Str. 23, 76133, Karlsruhe, Germany
Jan-Philipp Weiss

About this chapter

Cite this chapter

Mallach, S., Gutwenger, C. (2010). Improved Scalability by Using Hardware-Aware Thread Affinities. In: Keller, R., Kramer, D., Weiss, JP. (eds) Facing the Multicore-Challenge. Lecture Notes in Computer Science, vol 6310. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16233-6_6

DOI: https://doi.org/10.1007/978-3-642-16233-6_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16232-9
Online ISBN: 978-3-642-16233-6
eBook Packages: Computer Science, Computer Science (R0)
