, Volume 58, Issue 2, pp 304–338 | Cite as

NBmalloc: Allocating Memory in a Lock-Free Manner

  • Anders GidenstamEmail author
  • Marina Papatriantafilou
  • Philippas Tsigas
Open Access


Efficient, scalable memory allocation for multithreaded applications on multiprocessors is a significant goal of recent research. In the distributed computing literature it has been emphasized that lock-based synchronization and concurrency-control may limit the parallelism in multiprocessor systems. Thus, system services that employ such methods can hinder reaching the full potential of these systems. A natural research question is the pertinence and the impact of lock-free concurrency control in key services for multiprocessors, such as in the memory allocation service, which is the theme of this work. We show the design and implementation of NBmalloc, a lock-free memory allocator designed to enhance the parallelism in the system. The architecture of NBmalloc is inspired by Hoard, a well-known concurrent memory allocator, with modular design that preserves scalability and helps avoiding false-sharing and heap-blowup. Within our effort to design appropriate lock-free algorithms for NBmalloc, we propose and show a lock-free implementation of a new data structure, flat-set, supporting conventional “internal” set operations as well as “inter-object” operations, for moving items between flat-sets. The design of NBmalloc also involved a series of other algorithmic problems, which are discussed in the paper. Further, we present the implementation of NBmalloc and a study of its behaviour in a set of multiprocessor systems. The results show that the good properties of Hoard w.r.t. false-sharing and heap-blowup are preserved.


Memory allocation Lock-free synchronization Non-blocking synchronization Multithreaded applications 


  1. 1.
    Gidenstam, A., Papatriantafilou, M., Tsigas, P.: Allocating memory in a lock-free manner. In: Proc. of the 13th Annual European Symp. on Algorithms (ESA’05). LNCS, vol. 3669, pp. 329–342. Springer, Berlin (2005) Google Scholar
  2. 2.
    Berger, E.D.: Memory management for high-performance applications. Ph.D. Thesis, The University of Texas at Austin, Department of Computer Sciences (2002) Google Scholar
  3. 3.
    Berger, E., McKinley, K., Blumofe, R., Wilson, P.: Hoard: A scalable memory allocator for multithreaded applications. In: ASPLOS-IX: 9th Int. Conf. on Architectural Support for Programming Languages and Operating Systems, pp. 117–128 (2000) Google Scholar
  4. 4.
    Michael, M.: Scalable lock-free dynamic memory allocation. In: Proc. of SIGPLAN 2004 Conf. on Programming Languages Design and Implementation. ACM SIGPLAN Notices. ACM, New York (2004) Google Scholar
  5. 5.
    Dice, D., Garthwaite, A.: Mostly lock-free malloc. In: ISMM’02 Proc. of the 3rd Int. Symp. on Memory Management. ACM SIGPLAN Notices, pp. 163–174. ACM, New York (2002) Google Scholar
  6. 6.
    Michael, M.M.: Safe memory reclamation for dynamic lock-free objects using atomic reads and writes. In: Proceedings of the 21st ACM Symposium on Principles of Distributed Computing, pp. 21–30. ACM, New York (2002) Google Scholar
  7. 7.
    Michael, M.M.: Hazard pointers: Safe memory reclamation for lock-free objects. IEEE Trans. Parallel Distrib. Syst. 15, 491–504 (2004) CrossRefGoogle Scholar
  8. 8.
    Herlihy, M., Luchangco, V., Moir, M.: The repeat offender problem: A mechanism for supporting dynamic-sized, lock-free data structures. In: Proceedings of the 16th International Symposium on Distributed Computing (DISC’02). LNCS, vol. 2508, pp. 339–353. Springer, Berlin (2002) Google Scholar
  9. 9.
    Valois, J.D.: Lock-free data structures. Ph.D. Thesis, Rensselaer Polytechnic Institute, Department of Computer Science, Troy, New York (1995) Google Scholar
  10. 10.
    Michael, M.M., Scott, M.L.: Correction of a memory management method for lock-free data structures. Technical Report TR599, University of Rochester, Computer Science Department (1995) Google Scholar
  11. 11.
    Detlefs, D.L., Martin, P.A., Moir, M., Guy L. Steele, J.: Lock-free reference counting. In: Proceedings of the 20th Annual ACM Symposium on Principles of Distributed Computing, pp. 190–199. ACM, New York (2001) Google Scholar
  12. 12.
    Herlihy, M., Luchangco, V., Martin, P., Moir, M.: Nonblocking memory management support for dynamic-sized data structures. ACM Trans. Comput. Syst. 23, 146–196 (2005) CrossRefGoogle Scholar
  13. 13.
    Gidenstam, A., Papatriantafilou, M., Sundell, H., Tsigas, P.: Practical and efficient lock-free garbage collection based on reference counting. In: Proc. of the 8th International Symp. on Parallel Architectures, Algorithms, and Networks (I-SPAN), pp. 202–207. IEEE Comput. Soc., Los Alamitos (2005) Google Scholar
  14. 14.
    Schneider, S., Antonopoulos, C., Nikolopoulos, D.: Scalabel locality-conscious multithreaded memory allocation. In: Proceedings of the 2006 International Symposium on Memory Management (ISMM’06), pp. 84–94. ACM, New York (2006) CrossRefGoogle Scholar
  15. 15.
    Massalin, H., Pu, C.: A lock-free multiprocessor OS kernel. Technical Report CUCS-005-91 (1991) Google Scholar
  16. 16.
    Massalin, H.: Synthesis: an efficient implementation of fundamental operating system services. Ph.D. Thesis, Columbia University (1992) Google Scholar
  17. 17.
    Greenwald, M., Cheriton, D.R.: The synergy between non-blocking synchronization and operating system structure. In: Operating Systems Design and Implementation, pp. 123–136 (1996) Google Scholar
  18. 18.
    Greenwald, M.B.: Non-blocking synchronization and system design. Ph.D. Thesis, Stanford University (1999) Google Scholar
  19. 19.
    SGI: The standard template library for C++ (2003).
  20. 20.
    Blumofe, R.D., Leiserson, C.E.: Scheduling multithreaded computations by work stealing. J. Assoc. Comput. Mach. 46, 720–748 (1999) zbMATHMathSciNetGoogle Scholar
  21. 21.
    Gloger, W.: Wolfram Gloger’s malloc homepage (2003).
  22. 22.
    Steensgaard, B.: Thread-specific heaps for multi-threaded programs. In: ISMM 2000 Proc. of the Second Int. Symp. on Memory Management. ACM SIGPLAN Notices, vol. 36(1). ACM, New York (2000) Google Scholar
  23. 23.
    Herlihy, M.P., Wing, J.M.: Linearizability: A correctness condition for concurrent objects. ACM Trans. Program. Lang. Syst. 12, 463–492 (1990) CrossRefGoogle Scholar
  24. 24.
    Barnes, G.: A method for implementing lock-free shared data structures. In: Proc. of the 5th Annual ACM Symp. on Parallel Algorithms and Architectures, SIGACT and SIGARCH, pp. 261–270 (1993). Extended abstract Google Scholar
  25. 25.
    Herlihy, M.: Wait-free synchronization. ACM Trans. Program. Syst. 11, 124–149 (1991) CrossRefGoogle Scholar
  26. 26.
    Rinard, M.C.: Effective fine-grain synchronization for automatically parallelized programs using optimistic synchronization primitives. ACM Trans. Comput. Syst. 17, 337–371 (1999) CrossRefGoogle Scholar
  27. 27.
    Valois, J.D.: Lock-free linked lists using compare-and-swap. In: Proc. of the 14th Annual ACM Symp. on Principles of Distributed Computing (PODC ’95), pp. 214–222. ACM, New York (1995) CrossRefGoogle Scholar
  28. 28.
    Tsigas, P., Zhang, Y.: A simple, fast and scalable non-blocking concurrent fifo queue for shared memory multiprocessor systems. In: Proc. of the 13th Annual ACM Symp. on Parallel Algorithms and Architectures, pp. 134–143. ACM, New York (2001) Google Scholar
  29. 29.
    Harris, T.L.: A pragmatic implementation of non-blocking linked lists. In: Proc. of the 15th Int. Conf. on Distributed Computing, pp. 300–314. Springer, Berlin (2001) Google Scholar
  30. 30.
    Hoepman, J.H., Papatriantafilou, M., Tsigas, P.: Self-stabilization of wait-free shared memory objects. J. Parallel Distrib. Comput. 62, 766–791 (2002) CrossRefGoogle Scholar
  31. 31.
    Moir, M.: Practical implementations of non-blocking synchronization primitives. In: Proc. of the 16th Annual ACM Symp. on Principles of Distributed Computing, pp. 219–228 (1997) Google Scholar
  32. 32.
    Papatriantafilou, M., Tsigas, P.: Wait-free consensus in “in-phase” multiprocessor systems. In: Symp. on Parallel and Distributed Processing (SPDP ’95), pp. 312–319. IEEE Comput. Soc., Los Alamitos (1995) Google Scholar
  33. 33.
    Papatriantafilou, M., Tsigas, P.: On self-stabilizing wait-free clock synchronization. Parallel Process. Lett. 7, 321–328 (1997) CrossRefMathSciNetGoogle Scholar
  34. 34.
    Sundell, H., Tsigas, P.: Fast and lock-free concurrent priority queues for multi-thread systems. In: Proc. of the 17th IEEE/ACM Int. Parallel and Distributed Processing Symp. (IPDPS 03). IEEE Press, New York (2003) Google Scholar
  35. 35.
    Valois, J.D.: Implementing lock-free queues. In: Proc. of the Seventh Int. Conf. on Parallel and Distributed Computing Systems, pp. 64–69 (1994) Google Scholar
  36. 36.
    Michael, M.M.: High performance dynamic lock-free hash tables and list-based sets. In: Proc. of the 14th Annual ACM Symp. on Parallel Algorithms and Architectures (SPAA-02), pp. 73–82. ACM, New York (2002) Google Scholar
  37. 37.
    Jayanti, P.: A complete and constant time wait-free implementation of CAS from LL/SC and vice versa. In: Proceedings of the 12th International Symposium on Distributed Computing (DISC ’98). Lecture Notes in Computer Science, vol. 1499, pp. 216–230. Springer, Berlin (1998) Google Scholar
  38. 38.
    IBM: IBM System/370 Extended Architecture, Principles of Operation (1983). Publication No. SA22-7085 Google Scholar
  39. 39.
    Larson, P.P.Å., Krishnan, M.: Memory allocation for long-running server applications. In: ISMM’98 Proc. of the 1st Int. Symp. on Memory Management. ACM SIGPLAN Notices, pp. 176–185. ACM, New York (1998) Google Scholar
  40. 40.
    Dechev, D., Pirkelbauer, P., Stroustrup, B.: Lock-free dynamically resizable arrays. In: Proc. of the 10th Int. Conf. on Principles of Distributed Systems OPODIS’06. LNCS, vol. 4305, pp. 142–156. Springer, Berlin (2006) Google Scholar

Copyright information

© The Author(s) 2009

Authors and Affiliations

  • Anders Gidenstam
    • 1
    Email author
  • Marina Papatriantafilou
    • 2
  • Philippas Tsigas
    • 2
  1. 1.Algorithms and ComplexityMax-Planck-Institut für InformatikSaarbrückenGermany
  2. 2.Computer Science and EngineeringChalmers University of TechnologyGöteborgSweden

Personalised recommendations