Abstract
The complexity of efficient thread management rises steadily with the number of processor cores and with heterogeneity in system architecture design, e.g., in the topology of execution units and in the memory architecture. In this paper, we show that combining information about the system topology with hardware-aware thread management is worthwhile. We present such a hardware-aware approach, which uses thread affinity to automatically steer the mapping of threads to cores, and we analyze its performance experimentally. Our experiments show that it achieves significantly better scalability and runtime stability than the ordinary thread dispatching provided by the operating system.
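To make the notion of thread affinity concrete, the following is a minimal sketch of the underlying primitive only, not the approach presented in the chapter. It assumes a Linux system, where Python exposes `os.sched_setaffinity` and `os.sched_getaffinity`; the helper names `pin_current_thread` and `run_pinned` are hypothetical, and the sketch simply pins one thread per allowed CPU without consulting the cache or memory topology.

```python
import os
import threading


def pin_current_thread(cpu):
    """Restrict the calling thread to a single logical CPU (Linux only)."""
    # With pid 0, Linux applies the affinity mask to the calling thread.
    os.sched_setaffinity(0, {cpu})


def run_pinned(work, cpus):
    """Run work(index) in one thread per entry of `cpus`, each pinned to that CPU.

    Returns a dict mapping index -> (affinity mask seen by the thread, result).
    """
    results = {}

    def body(index, cpu):
        pin_current_thread(cpu)
        mask = sorted(os.sched_getaffinity(0))
        results[index] = (mask, work(index))

    threads = [
        threading.Thread(target=body, args=(i, cpu))
        for i, cpu in enumerate(cpus)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    # Only CPUs the process is already allowed to use are valid pin targets.
    allowed = sorted(os.sched_getaffinity(0))
    pinned = run_pinned(lambda i: i * i, allowed)
    for index, (mask, value) in sorted(pinned.items()):
        print(index, mask, value)
```

A hardware-aware scheme would choose the entries of `cpus` from topology information (for example, which cores share a cache) rather than taking them in numerical order as this sketch does.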
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Mallach, S., Gutwenger, C. (2010). Improved Scalability by Using Hardware-Aware Thread Affinities. In: Keller, R., Kramer, D., Weiss, JP. (eds) Facing the Multicore-Challenge. Lecture Notes in Computer Science, vol 6310. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16233-6_6
DOI: https://doi.org/10.1007/978-3-642-16233-6_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16232-9
Online ISBN: 978-3-642-16233-6
eBook Packages: Computer Science (R0)