International Journal of Parallel Programming, Volume 43, Issue 5, pp 721–751

Queue-Based and Adaptive Lock Algorithms for Scalable Resource Allocation on Shared-Memory Multiprocessors



We present a scalable lock algorithm and an adaptive scheme for shared-memory multiprocessors that address the resource allocation problem, also known as the \(h\)-out-of-\(k\) mutual exclusion problem. In this problem, threads compete for \(k\) shared resources, and a thread may request an arbitrary number \(1\le h\le k\) of resources at the same time. The challenge is for each thread to acquire exclusive access to its desired resources while preventing deadlock and starvation. Many existing approaches solve this problem in a distributed system, but the explicit message-passing paradigm they adopt is not optimal for shared memory. Other applicable methods, such as two-phase locking and resource hierarchy, suffer from performance degradation under heavy contention and lack a desirable fairness guarantee. This work describes the first multi-resource lock algorithm that guarantees the strongest first-in, first-out fairness. Our methodology is based on a non-blocking queue in which competing threads spin on previous conflicting resource requests. In our experimental evaluation, we compared the overhead and scalability of our lock with the best available alternative approaches using a micro-benchmark. As contention increases, our multi-resource lock obtains an average eightfold speed-up over the alternatives, including GNU C++’s lock method, Boost’s lock function, and Intel TBB’s queue mutex. To further improve performance at low levels of contention, we introduce an adaptive scheme that is composed of two different lock algorithms and alternates between them depending on the level of contention. Our experimental results show that the composite adaptive scheme achieves the best overall performance compared with using either lock alone when system contention is not known a priori.
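The idea of spinning only on earlier conflicting requests can be illustrated with a minimal C++ sketch. It is not the algorithm from the paper: in place of the non-blocking queue the authors describe, it assumes a simpler ticket counter plus a fixed ring of 64 slots, and it encodes each request as a 64-bit mask with one bit per resource. The names MultiResourceLock, acquire, release, kSlots and run_demo are hypothetical and chosen only for illustration.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

// Illustrative sketch only (assumed design, not the paper's algorithm):
// a FIFO multi-resource lock for up to 64 resources.
class MultiResourceLock {
    static constexpr std::uint64_t kSlots = 64;  // ring capacity (assumption)
    struct Slot {
        // seq == 2t     : slot is free and reserved for ticket t
        // seq == 2t + 1 : ticket t has published its request mask
        std::atomic<std::uint64_t> seq{0};
        std::atomic<std::uint64_t> mask{0};
    };
    std::array<Slot, kSlots> slots_;
    std::atomic<std::uint64_t> next_{0};  // next ticket to hand out

public:
    MultiResourceLock() {
        for (std::uint64_t i = 0; i < kSlots; ++i)
            slots_[i].seq.store(2 * i, std::memory_order_relaxed);
    }

    // Blocks until every earlier request that overlaps `want` has released.
    std::uint64_t acquire(std::uint64_t want) {
        assert(want != 0);
        const std::uint64_t t = next_.fetch_add(1, std::memory_order_relaxed);
        Slot& mine = slots_[t % kSlots];
        // Wait until the previous user of this slot (ticket t - kSlots) is done.
        while (mine.seq.load(std::memory_order_acquire) != 2 * t)
            std::this_thread::yield();
        mine.mask.store(want, std::memory_order_relaxed);
        mine.seq.store(2 * t + 1, std::memory_order_release);

        // Visit earlier tickets in FIFO order; spin only on conflicting ones.
        const std::uint64_t first = t >= kSlots ? t - kSlots + 1 : 0;
        for (std::uint64_t p = first; p < t; ++p) {
            Slot& s = slots_[p % kSlots];
            for (;;) {
                const std::uint64_t v = s.seq.load(std::memory_order_acquire);
                if (v > 2 * p + 1) break;  // ticket p has already released
                if (v == 2 * p + 1 &&
                    (s.mask.load(std::memory_order_relaxed) & want) == 0)
                    break;                 // published, but no shared resource
                std::this_thread::yield(); // unpublished or conflicting: wait
            }
        }
        return t;
    }

    // Frees the slot for the ticket that will reuse it one lap later.
    void release(std::uint64_t ticket) {
        slots_[ticket % kSlots].seq.store(2 * (ticket + kSlots),
                                          std::memory_order_release);
    }
};

// Each thread locks two neighbouring resources and bumps their plain counters.
long run_demo(int threads, int iters) {
    constexpr int kResources = 8;
    MultiResourceLock lock;
    std::array<long, kResources> counters{};
    std::vector<std::thread> pool;
    for (int id = 0; id < threads; ++id) {
        pool.emplace_back([&, id] {
            for (int i = 0; i < iters; ++i) {
                const int a = (id + i) % kResources;
                const int b = (a + 1) % kResources;
                const std::uint64_t want = (1ull << a) | (1ull << b);
                const std::uint64_t ticket = lock.acquire(want);
                ++counters[a];  // protected by the lock, so no atomics needed
                ++counters[b];
                lock.release(ticket);
            }
        });
    }
    for (auto& th : pool) th.join();
    long total = 0;
    for (long c : counters) total += c;
    return total;
}
```

Under these assumptions, run_demo(4, 2000) should return 16000, since each of the 8000 critical sections increments two counters. Because a request waits only on earlier tickets, the waiting order is acyclic, which is what rules out deadlock in this sketch; requests for disjoint resources pass each other without spinning.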


\(H\)-out-of-\(k\) mutual exclusion; Lock-free programming; Queue-based lock; Resource allocation



Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  1. University of Central Florida, Orlando, USA
