Cooperative Scheduling of Parallel Tasks with General Synchronization Patterns

  • Shams Imam
  • Vivek Sarkar
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8586)


In this paper, we address the problem of scheduling parallel tasks with general synchronization patterns using a cooperative runtime. Current implementations of task-parallel programming models provide efficient support for fork-join parallelism, but are unable to efficiently support more general synchronization patterns such as locks, futures, barriers and phasers. We propose a novel approach to this challenge based on cooperative scheduling with one-shot delimited continuations (OSDeConts) and event-driven controls (EDCs). OSDeConts enable the runtime to suspend a task at any point, so that the task’s worker can switch to another task, whereas other runtimes would force the task’s worker to block. EDCs ensure that suspended tasks that are ready to be resumed can be identified efficiently. Furthermore, our approach is more efficient than schedulers that spawn additional worker threads to compensate for blocked worker threads.
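The idea of suspending a task instead of blocking its worker can be illustrated with a minimal Java sketch. It is not the Habanero-Java implementation: the class `EventDrivenControl` and its methods `suspendUntilTriggered` and `trigger` are hypothetical names, and an explicit callback stands in for a one-shot delimited continuation (a real OSDeCont would capture the rest of the task automatically). The sketch assumes a one-shot event that stores the continuations waiting on it and hands them back to the worker pool when it is triggered, so no polling is needed to find resumable tasks.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical one-shot event: holds a value and the continuations waiting on it. */
final class EventDrivenControl<T> {
    private final ExecutorService workers;
    private final List<Consumer<T>> waiting = new ArrayList<>();
    private boolean triggered = false;
    private T value;

    EventDrivenControl(ExecutorService workers) { this.workers = workers; }

    /** Called where a task would otherwise block: register the rest of the task and return. */
    void suspendUntilTriggered(Consumer<T> continuation) {
        final T ready;
        synchronized (this) {
            if (!triggered) {            // not ready yet: park the continuation, free the worker
                waiting.add(continuation);
                return;
            }
            ready = value;
        }
        workers.execute(() -> continuation.accept(ready)); // already ready: schedule right away
    }

    /** Called by the producing task: every parked continuation becomes runnable. */
    void trigger(T v) {
        final List<Consumer<T>> toResume;
        synchronized (this) {
            if (triggered) throw new IllegalStateException("one-shot event already triggered");
            triggered = true;
            value = v;
            toResume = new ArrayList<>(waiting);
            waiting.clear();
        }
        for (Consumer<T> k : toResume) workers.execute(() -> k.accept(v));
    }
}

final class EdcSketch {
    static List<String> runDemo() throws InterruptedException {
        // A single worker: a blocking wait in the consumer would starve the producer.
        ExecutorService workers = Executors.newSingleThreadExecutor();
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        EventDrivenControl<Integer> edc = new EventDrivenControl<>(workers);
        CountDownLatch done = new CountDownLatch(1);

        workers.execute(() -> {          // consumer task
            log.add("consumer: suspended");
            edc.suspendUntilTriggered(v -> {
                log.add("consumer: resumed with " + v);
                done.countDown();
            });
        });
        workers.execute(() -> {          // producer task
            log.add("producer: triggering");
            edc.trigger(42);
        });

        done.await(5, TimeUnit.SECONDS); // wait for the resumed continuation, then shut down
        workers.shutdown();
        return log;
    }

    public static void main(String[] args) throws InterruptedException {
        runDemo().forEach(System.out::println);
    }
}
```

With only one worker thread, a consumer that blocked while waiting for the value would prevent the producer from ever running. In the sketch the consumer returns after registering its continuation, the producer runs on the same worker, and the continuation is scheduled once the event fires.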

We have implemented our cooperative runtime in Habanero-Java (HJ), an explicitly parallel language with a large variety of synchronization patterns. The OSDeConts and EDC primitives are used to implement a wide range of synchronization constructs, including those where a task may trigger the enablement of multiple suspended tasks (as in futures, barriers and phasers). In contrast, current task-parallel runtimes and schedulers for the fork-join model (including schedulers for the Cilk language) focus on the case where only one continuation is enabled by an event (typically, the termination of the last child/descendant task in a join scope). Our experimental results show that the HJ cooperative runtime delivers significant improvements in performance and memory utilization on various benchmarks using future and phaser constructs, relative to a thread-blocking runtime system while using the same underlying work-stealing task scheduler.
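The case where one event enables several suspended tasks, as with a future that has multiple readers, can be sketched with the standard `java.util.concurrent.CompletableFuture` class. This is a stand-in chosen for illustration, not the HJ runtime's API; the class name `MultiResumeSketch` and the task labels are hypothetical. Three dependent tasks are registered before the value exists, and a single completion makes all three runnable on the same one-thread pool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class MultiResumeSketch {
    static List<String> runDemo() {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        List<String> log = new CopyOnWriteArrayList<>();
        CompletableFuture<Integer> result = new CompletableFuture<>();

        // Three consumers register continuations; none of them occupies the worker while waiting.
        List<CompletableFuture<Void>> consumers = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            final int id = i;
            consumers.add(result.thenAcceptAsync(v -> log.add("task " + id + " got " + v), worker));
        }

        // One producer task completes the future, which enables all three continuations.
        worker.execute(() -> result.complete(7));

        // Wait for every continuation to finish before shutting the pool down.
        CompletableFuture.allOf(consumers.toArray(new CompletableFuture[0])).join();
        worker.shutdown();
        return log;
    }

    public static void main(String[] args) {
        runDemo().forEach(System.out::println);
    }
}
```

The order in which the three continuations run is not specified by `CompletableFuture`, so the sketch makes no assumption about it.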


Task Parallelism, Cooperative Scheduling, Delimited Continuations, Async-Finish Parallelism, Habanero-Java





Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  • Shams Imam (1)
  • Vivek Sarkar (1)
  1. Department of Computer Science, Rice University, USA
