A dynamic-sized nonblocking work stealing deque
The non-blocking work-stealing algorithm of Arora, Blumofe, and Plaxton (henceforth ABP work-stealing) is on its way to becoming the multiprocessor load balancing technology of choice in both industry and academia. This highly efficient scheme is based on a collection of array-based double-ended queues (deques) with low-cost synchronization among local and stealing processes. Unfortunately, the algorithm's synchronization protocol is strongly based on the use of fixed-size arrays, which are prone to overflows, especially in the multiprogrammed environments for which they are designed. This is a significant drawback since, apart from memory inefficiency, it means that the size of the deque must be tailored to accommodate the effects of the hard-to-predict level of multiprogramming, and the implementation must include an expensive and application-specific overflow mechanism.
This paper presents the first dynamic memory work-stealing algorithm. It is based on a novel way of building non-blocking dynamic-sized work stealing deques by detecting synchronization conflicts based on “pointer-crossing” rather than “gaps between indexes” as in the original ABP algorithm. As we show, the new algorithm dramatically increases robustness and memory efficiency, while causing applications no observable performance penalty. We therefore believe it can replace array-based ABP work stealing deques, eliminating the need for application-specific overflow mechanisms.
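The overflow problem with fixed-size arrays can be illustrated with a small sketch. This is a sequential toy model only, not the ABP synchronization protocol and not the algorithm of this paper: it has no atomic operations, and the class name `FixedArrayDeque` and its method names are hypothetical, chosen for illustration. It assumes the owner pushes and pops at the bottom, thieves take from the top, and the indexes are reset only when the deque becomes empty.

```python
class FixedArrayDeque:
    """Sequential illustration of a fixed-size array deque.

    Hypothetical sketch: no synchronization, so it is NOT safe for
    concurrent use and does not implement the ABP protocol.
    """

    def __init__(self, capacity):
        self.tasks = [None] * capacity  # fixed-size array, never resized
        self.top = 0                    # end a thief would steal from
        self.bottom = 0                 # end the owner pushes to and pops from

    def push_bottom(self, task):
        """Owner adds a task; returns False when the array has overflowed."""
        if self.bottom == len(self.tasks):
            return False  # overflow: the caller needs its own fallback
        self.tasks[self.bottom] = task
        self.bottom += 1
        return True

    def pop_bottom(self):
        """Owner removes the most recently pushed task, or None if empty."""
        if self.bottom == self.top:
            self.top = self.bottom = 0  # empty: reclaim the whole array
            return None
        self.bottom -= 1
        task = self.tasks[self.bottom]
        self.tasks[self.bottom] = None
        if self.bottom == self.top:
            self.top = self.bottom = 0  # became empty: reset both indexes
        return task

    def pop_top(self):
        """Thief removes the oldest task, or None if empty."""
        if self.top == self.bottom:
            return None
        task = self.tasks[self.top]
        self.tasks[self.top] = None
        self.top += 1  # the slot below top stays unusable until a reset
        return task
```

In this model, a deque of capacity 2 rejects a third `push_bottom`, and it keeps rejecting pushes even after a `pop_top` has freed a slot, because space below `top` is only reclaimed once the deque empties. That is the kind of memory inefficiency and overflow handling that a dynamic-sized deque is meant to avoid.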
Keywords: Concurrent programming, Load balancing, Work stealing, Lock-free, Data structures