An Efficient Unbounded Lock-Free Queue for Multi-core Systems

  • Marco Aldinucci
  • Marco Danelutto
  • Peter Kilpatrick
  • Massimiliano Meneghin
  • Massimo Torquati
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7484)

Abstract

The use of efficient synchronization mechanisms is crucial for implementing fine-grained parallel programs on modern shared-cache multi-core architectures. In this paper we study this problem by considering Single-Producer/Single-Consumer (SPSC) coordination using unbounded queues. A novel unbounded SPSC algorithm capable of reducing the raw synchronization latency and speeding up Producer-Consumer coordination is presented. The algorithm has been extensively tested on a shared-cache multi-core platform, and a proof sketch of its correctness is presented. The proposed queues have been used as basic building blocks to implement the FastFlow parallel framework, which has been demonstrated to offer very good performance for fine-grained parallel applications.
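For readers unfamiliar with SPSC coordination, the following is a minimal sketch of a bounded, Lamport-style SPSC ring buffer in C++. It is not the unbounded algorithm presented in the paper; the class name `SpscRing` and its `push`/`pop` interface are assumptions chosen for illustration. It only shows the general idea that a single producer and a single consumer can coordinate through two indices without locks, since each index has exactly one writer.

```cpp
#include <atomic>
#include <cstddef>
#include <vector>

// Hypothetical illustration only: a bounded Lamport-style SPSC ring buffer.
// This is NOT the unbounded queue described in the paper.
template <typename T>
class SpscRing {
public:
    // One extra slot is allocated so that "full" and "empty" are distinguishable.
    explicit SpscRing(std::size_t capacity)
        : buf_(capacity + 1), head_(0), tail_(0) {}

    // Must be called only by the single producer thread.
    bool push(const T& value) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t next = (tail + 1) % buf_.size();
        if (next == head_.load(std::memory_order_acquire)) {
            return false;  // queue is full
        }
        buf_[tail] = value;
        // Release: the element write above becomes visible before the new tail.
        tail_.store(next, std::memory_order_release);
        return true;
    }

    // Must be called only by the single consumer thread.
    bool pop(T& out) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        if (head == tail_.load(std::memory_order_acquire)) {
            return false;  // queue is empty
        }
        out = buf_[head];
        // Release: the slot read above completes before the slot is handed back.
        head_.store((head + 1) % buf_.size(), std::memory_order_release);
        return true;
    }

private:
    std::vector<T> buf_;
    std::atomic<std::size_t> head_;  // next slot to read; written only by the consumer
    std::atomic<std::size_t> tail_;  // next slot to write; written only by the producer
};
```

In this sketch `push` returns `false` when the buffer is full, so the producer has to retry or back off. An unbounded queue such as the one the paper studies removes that failure case for the producer.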

Keywords

Lock-free algorithms, wait-free algorithms, bounded and unbounded SPSC queues, cache-coherent multi-cores

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Marco Aldinucci (1)
  • Marco Danelutto (2)
  • Peter Kilpatrick (3)
  • Massimiliano Meneghin (4)
  • Massimo Torquati (2)

  1. Computer Science Department, University of Torino, Italy
  2. Computer Science Department, University of Pisa, Italy
  3. Computer Science Department, Queen’s University Belfast, UK
  4. IBM Dublin Research Lab, Ireland
