Allocating Memory in a Lock-Free Manner

  • Anders Gidenstam
  • Marina Papatriantafilou
  • Philippas Tsigas
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3669)


The potential of multiprocessor systems is often not fully realized by their system services. Certain synchronization methods, such as lock-based ones, may limit parallelism. It is therefore important to examine the impact of wait-free and lock-free synchronization design on key services for multiprocessor systems, such as the memory allocation service. Efficient, scalable memory allocators for multithreaded applications on multiprocessors are a significant goal of recent research projects.

We propose a lock-free memory allocator to enhance the parallelism in the system. Its architecture is inspired by Hoard, a successful concurrent memory allocator, with a modular design that preserves scalability and helps avoid false sharing and heap blowup. As part of our effort to design appropriate lock-free algorithms for constructing this system, we propose a new non-blocking data structure called flat-sets, which supports conventional “internal” operations as well as “inter-object” operations for moving items between flat-sets.
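To make the idea of non-blocking "internal" operations more concrete, the following is a minimal sketch of a fixed-capacity set whose insert and remove operations rely only on atomic compare-and-swap and exchange. It is not the flat-set algorithm proposed in the paper: the name `SlotSet`, its interface, and the slot-array layout are assumptions made purely for illustration. The "inter-object" move operation between sets is deliberately not attempted here, since composing a remove and an insert would leave the item briefly in neither set.

```cpp
#include <array>
#include <atomic>
#include <cstddef>

// Hypothetical illustration only: a fixed-capacity set of pointers.
// This is NOT the paper's flat-set; it only shows CAS-based operations
// that never take a lock.
template <typename T, std::size_t N>
class SlotSet {
public:
    SlotSet() {
        // Mark every slot as empty.
        for (auto& s : slots_) s.store(nullptr, std::memory_order_relaxed);
    }

    // Claim the first empty slot with a single compare-and-swap.
    // Returns false if no empty slot was found during the scan.
    bool insert(T* item) {
        for (auto& s : slots_) {
            T* expected = nullptr;
            if (s.compare_exchange_strong(expected, item,
                                          std::memory_order_acq_rel,
                                          std::memory_order_relaxed)) {
                return true;
            }
        }
        return false;
    }

    // Take some item out of the set, or return nullptr if the scan
    // found every slot empty.
    T* remove_any() {
        for (auto& s : slots_) {
            T* item = s.exchange(nullptr, std::memory_order_acq_rel);
            if (item != nullptr) return item;
        }
        return nullptr;
    }

private:
    std::array<std::atomic<T*>, N> slots_;
};
```

Each operation finishes after at most N atomic steps regardless of what other threads do, so no thread can block another by being preempted while holding a lock. An allocator could, under these assumptions, keep free blocks of one size class in such a set; how the actual allocator organizes its blocks is described in the paper itself.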

We implemented the memory allocator on a set of multiprocessor systems (a UMA Sun Enterprise 450 and a ccNUMA Origin 3800) and studied its behaviour. The results show that Hoard's good properties with respect to false sharing and heap blowup are preserved, while scalability is enhanced even further with the help of lock-free synchronization.


Keywords (machine-generated, not supplied by the authors): Multiprocessor System, Shared Location, Move Operation, Memory Request, False Sharing





Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Anders Gidenstam
  • Marina Papatriantafilou
  • Philippas Tsigas

Department of Computer Science and Engineering, Chalmers University of Technology, Göteborg, Sweden
