Delegation Locking Libraries for Improved Performance of Multithreaded Programs

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8632)


While standard locking libraries are common and easy to use, delegation algorithms that offload work to a single thread can achieve better performance in multithreaded applications, but are hard to use without adequate library support. This paper presents an interface for delegation locks together with libraries for C and C++ that make it easy to use queue delegation locking, a versatile high-performance delegation algorithm. We show examples of using these libraries, discuss the porting effort needed to take full advantage of delegation locking in applications designed with standard locking in mind, and report the performance improvement that this porting achieves.
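To make the idea of delegation concrete, the following is a minimal C++ sketch, not the interface or the queue delegation locking algorithm presented in the paper. It assumes a deliberately simplified design: a critical section is handed over as a closure, and whichever thread finds no active helper becomes the helper and runs all pending closures one at a time. The names `SimpleDelegationLock`, `delegate`, and `run_counter_demo` are hypothetical and chosen only for illustration; a real delegation lock would use a far more efficient queue than a mutex-protected `std::deque`.

```cpp
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical, simplified delegation lock (illustration only).
class SimpleDelegationLock {
public:
    // Hand a critical section to the lock. The caller returns immediately
    // if another thread is already acting as helper; otherwise the caller
    // becomes the helper and executes all queued critical sections.
    void delegate(std::function<void()> op) {
        {
            std::lock_guard<std::mutex> guard(mutex_);
            queue_.push_back(std::move(op));
            if (helper_active_) return;  // the current helper will run it
            helper_active_ = true;       // this thread becomes the helper
        }
        for (;;) {
            std::function<void()> next;
            {
                std::lock_guard<std::mutex> guard(mutex_);
                if (queue_.empty()) {
                    helper_active_ = false;  // nothing left: stop helping
                    return;
                }
                next = std::move(queue_.front());
                queue_.pop_front();
            }
            next();  // run outside the mutex; only the helper executes ops
        }
    }

private:
    std::mutex mutex_;
    std::deque<std::function<void()>> queue_;
    bool helper_active_ = false;
};

// Each thread delegates `per_thread` increments of a shared counter.
// The counter is a plain variable: only one helper touches it at a time.
long run_counter_demo(int threads, int per_thread) {
    SimpleDelegationLock lock;
    long counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&lock, &counter, per_thread] {
            for (int i = 0; i < per_thread; ++i) {
                lock.delegate([&counter] { ++counter; });
            }
        });
    }
    for (auto& w : workers) w.join();
    // Every queued operation has run once all threads have been joined,
    // because a helper only stops after seeing an empty queue.
    return counter;
}
```

Calling `run_counter_demo(4, 1000)` delegates 4000 increments in total. The point of the sketch is that a delegating thread does not wait for its critical section to execute, which is the property that code written for standard locks usually has to be ported to exploit.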


Keywords: Critical Section, Actual Execution, Cache Line, Cache Coherence, Helper Thread



Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. Department of Information Technology, Uppsala University, Sweden
