A Work Stealing Scheduler for Parallel Loops on Shared Cache Multicores

Tchiboukdjian, Marc; Danjean, Vincent; Gautier, Thierry; Le Mentec, Fabien; Raffin, Bruno

doi:10.1007/978-3-642-21878-1_13


Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6586)

Included in the following conference series:

European Conference on Parallel Processing


Abstract

Reordering instructions and data layout can bring significant performance improvements for memory-bound applications. Parallelizing such applications requires careful algorithm design to preserve the locality of the sequential execution. In this paper, we seek a parallelization of memory-bound applications on multicore processors that preserves the advantage of a shared cache. We focus on sequential applications that iterate through a sequence of memory references. Our solution relies on a work-stealing scheduler combined with a dynamic sliding window that constrains cores sharing the same cache to process data that are close in memory. This parallel algorithm induces the same number of cache misses as the sequential algorithm, at the expense of an increased number of synchronizations. Experiments with a memory-bound application confirm that core collaboration for shared cache access can bring significant performance improvements despite the incurred synchronization costs.
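
The sliding-window constraint described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it replaces work stealing with a simpler shared counter protected by a lock, and it only shows how a window can keep concurrently processed indices close together. The function name `sliding_window_for` and the parameters `window` and `grain` are hypothetical names chosen for this illustration.

```python
import threading


def sliding_window_for(n, body, num_workers=4, window=64, grain=8):
    """Call body(i) once for every i in range(n) using num_workers threads.

    Illustrative assumption: chunks of `grain` indices are handed out from a
    shared counter (not stolen from per-core queues). A chunk may only be
    claimed if it ends within `window` indices of the oldest unfinished chunk.
    Returns the largest distance observed between that oldest unfinished
    index and the end of a newly claimed chunk.
    """
    if grain <= 0 or window < grain:
        raise ValueError("need 0 < grain <= window")

    cond = threading.Condition()
    # next: first unclaimed index; base: start of the oldest unfinished chunk
    state = {"next": 0, "base": 0, "max_span": 0}
    finished = set()  # starts of chunks that finished ahead of `base`

    def worker():
        while True:
            with cond:
                # Block while the next chunk would fall outside the window.
                while (state["next"] < n
                       and state["next"] + grain > state["base"] + window):
                    cond.wait()
                if state["next"] >= n:
                    return
                start = state["next"]
                stop = min(start + grain, n)
                state["next"] = stop
                state["max_span"] = max(state["max_span"],
                                        stop - state["base"])
            # Process the chunk outside the lock.
            for i in range(start, stop):
                body(i)
            with cond:
                finished.add(start)
                # Slide the window past every contiguous finished chunk.
                while state["base"] in finished:
                    finished.remove(state["base"])
                    state["base"] = min(state["base"] + grain, n)
                cond.notify_all()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["max_span"]


if __name__ == "__main__":
    data = [0] * 1000

    def square(i):
        data[i] = i * i

    span = sliding_window_for(len(data), square, window=64, grain=8)
    print("largest in-flight span:", span)
```

In this sketch a smaller `window` keeps the threads on nearby data, which is the locality property the abstract attributes to the sliding window, while forcing more waiting on the condition variable, which mirrors the synchronization cost the abstract mentions. CPython threads do not run Python bytecode in parallel, so the sketch demonstrates the scheduling constraint only, not a speedup.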



Author information

Authors and Affiliations

MOAIS Project, INRIA-LIG, France
Marc Tchiboukdjian, Vincent Danjean, Thierry Gautier, Fabien Le Mentec & Bruno Raffin


Editor information

Editors and Affiliations

CNR, ICAR, Via P. Castellino, 111, 80131, Napoli, Italy
Mario R. Guarracino
INRIA, PIP ENS Lyon, 46 Allée d’Italie, 69364, Lyon, France
Frédéric Vivien
Scientific Computing, University of Vienna, Nordbergstr. 15/3C, 1090, Vienna, Austria
Jesper Larsson Träff
University of Catanzaro, 88100, Catanzaro, Italy
Mario Cannataro
Dept. of Computer Science, University of Pisa, Via Tevere 17, 56122, Pisa, Italy
Marco Danelutto
Gavle Creative Media Lab, Kungsbacksvagen 47, 80632, Gavle, Sweden
Anders Hast
Dept. Math & Stat, University of Naples Parthenope, via Medina 40, 80133, Napoli, Italy
Francesca Perla
TU Dresden, Zellescher Weg 12-14, 01187, Dresden, Germany
Andreas Knüpfer
Dipartimento di Ingegneria dell’Informazione, Seconda Università di Napoli, via Roma 29, 81031, Aversa, Italy
Beniamino Di Martino
Scaledinfra technologies GmbH, Köllnerhofgasse 3/15A, 1010, Vienna, Austria
Michael Alexander


About this paper

Cite this paper

Tchiboukdjian, M., Danjean, V., Gautier, T., Le Mentec, F., Raffin, B. (2011). A Work Stealing Scheduler for Parallel Loops on Shared Cache Multicores. In: Guarracino, M.R., et al. Euro-Par 2010 Parallel Processing Workshops. Euro-Par 2010. Lecture Notes in Computer Science, vol 6586. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21878-1_13


DOI: https://doi.org/10.1007/978-3-642-21878-1_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21877-4
Online ISBN: 978-3-642-21878-1
eBook Packages: Computer Science, Computer Science (R0)
