The Adaptive Priority Queue with Elimination and Combining

  • Irina Calciu
  • Hammurabi Mendes
  • Maurice Herlihy
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8784)

Abstract

Priority queues are fundamental abstract data structures, often used to manage limited resources in parallel programming. Several proposed parallel priority queue implementations are based on skiplists, harnessing the potential for parallelism of the add() operations. In addition, methods such as Flat Combining have been proposed to reduce contention, batching together multiple operations to be executed by a single thread. While this technique can decrease lock-switching overhead and the number of pointer changes required by the removeMin() operations in the priority queue, it can also create a sequential bottleneck and limit parallelism, especially for non-conflicting add() operations.
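To make the batching idea concrete, here is a minimal sketch of flat combining over a binary heap. It illustrates the general technique only, not the design evaluated in this paper. The class name FlatCombiningPQ, the method remove_min(), and the internal publication list are all hypothetical names chosen for the example. Threads post requests to a shared list, and whichever thread acquires the combiner lock applies the whole batch on behalf of the others.

```python
import heapq
import threading


class FlatCombiningPQ:
    """Toy flat-combining priority queue (all names are hypothetical)."""

    def __init__(self):
        self._heap = []                         # sequential heap, touched only by the combiner
        self._combiner_lock = threading.Lock()  # the holder acts as the combiner
        self._pending = []                      # publication list of posted requests
        self._pending_lock = threading.Lock()   # protects the publication list

    def add(self, value):
        self._submit("add", value)

    def remove_min(self):
        return self._submit("removeMin")

    def _submit(self, op, value=None):
        req = {"op": op, "value": value, "result": None,
               "done": threading.Event()}
        with self._pending_lock:
            self._pending.append(req)
        # Either become the combiner or wait for another combiner to serve us.
        while not req["done"].is_set():
            if self._combiner_lock.acquire(blocking=False):
                try:
                    self._combine()
                finally:
                    self._combiner_lock.release()
            else:
                req["done"].wait(timeout=0.001)
        return req["result"]

    def _combine(self):
        # Take the whole batch at once, then apply it sequentially.
        with self._pending_lock:
            batch, self._pending = self._pending, []
        for req in batch:
            if req["op"] == "add":
                heapq.heappush(self._heap, req["value"])
            else:
                req["result"] = heapq.heappop(self._heap) if self._heap else None
            req["done"].set()
```

In this sketch every operation, including add() calls that would not conflict with each other, passes through the single combiner. That is the sequential bottleneck described above.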

In this paper, we describe a novel priority queue design that combines the scalability of parallel insertions with the efficiency of batched removals. Moreover, we present a new elimination algorithm suitable for a priority queue, which further increases concurrency on balanced workloads with similar numbers of add() and removeMin() operations. We implement and evaluate our design using a variety of techniques, including locking, atomic operations, and hardware transactional memory, together with adaptive heuristics that respond to the workload.
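The abstract does not spell out when an add() and a removeMin() may eliminate each other. The single-threaded sketch below assumes one natural condition: an added value that is no larger than the current minimum can be handed straight to a waiting removeMin(), because that is the value the remover would have received had the add() completed first. The class EliminatingPQ and its methods are hypothetical names, and a real implementation would need synchronization that is omitted here.

```python
import heapq
from collections import deque


class EliminatingPQ:
    """Single-threaded model of elimination (all names are hypothetical)."""

    def __init__(self):
        self._heap = []          # backing store for values that were not eliminated
        self._waiting = deque()  # removeMin requests parked in elimination slots
        self.eliminated = 0      # number of add/removeMin pairs matched directly

    def post_remove_min(self):
        """Park a removeMin request and return its slot."""
        slot = {"value": None, "filled": False}
        self._waiting.append(slot)
        return slot

    def add(self, value):
        current_min = self._heap[0] if self._heap else None
        # Assumed rule: eliminate only if the value would be the new minimum.
        if self._waiting and (current_min is None or value <= current_min):
            slot = self._waiting.popleft()
            slot["value"], slot["filled"] = value, True
            self.eliminated += 1
            return
        heapq.heappush(self._heap, value)

    def serve_waiting(self):
        """Serve the remaining removers from the backing store in one batch."""
        while self._waiting and self._heap:
            slot = self._waiting.popleft()
            slot["value"], slot["filled"] = heapq.heappop(self._heap), True
```

An eliminated pair never touches the backing store, which is why elimination would be expected to help most when add() and removeMin() arrive in similar numbers.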

Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  • Irina Calciu (1)
  • Hammurabi Mendes (1)
  • Maurice Herlihy (1)

  1. Department of Computer Science, Brown University, USA
