A Bulk-Parallel Priority Queue in External Memory with STXXL

  • Timo Bingmann
  • Thomas Keh
  • Peter Sanders
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9125)


We propose the design and an implementation of a bulk-parallel external memory priority queue to take advantage of both shared-memory parallelism and high external memory transfer speeds to parallel disks. To achieve higher performance by decoupling item insertions and extractions, we offer two parallelization interfaces: one using “bulk” sequences, the other by defining “limit” items. In the design, we discuss how to parallelize insertions using multiple heaps, and how to calculate a dynamic prediction sequence to prefetch blocks and apply parallel multiway merge for extraction. Our experimental results show that in the selected benchmarks the priority queue reaches 64% of the full parallel I/O bandwidth of SSDs and 49% of rotational disks, or the speed of sorting in external memory when bounded by computation.
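
As a rough illustration of the ideas named in the abstract, the following toy C++ sketch keeps several insertion heaps and serves bulk extractions by repeatedly selecting the smallest heap top, which stands in for a multiway merge. It is purely in-memory and sequential: there is no external memory, no prefetching, and no parallelism. The class name `ToyBulkPQ` and the methods `bulk_push`, `bulk_pop`, and `pop_below` are hypothetical names chosen for this sketch and are not the STXXL interface; `pop_below` is only one possible reading of the "limit" item idea.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <vector>

// Toy min-priority queue with several insertion heaps (hypothetical, not STXXL).
class ToyBulkPQ {
    using MinHeap = std::priority_queue<int, std::vector<int>, std::greater<int>>;
    std::vector<MinHeap> heaps_;

    // Returns the heap holding the globally smallest item,
    // or nullptr if every heap is empty.
    MinHeap* smallest_heap() {
        MinHeap* best = nullptr;
        for (MinHeap& h : heaps_)
            if (!h.empty() && (best == nullptr || h.top() < best->top()))
                best = &h;
        return best;
    }

public:
    explicit ToyBulkPQ(std::size_t num_heaps) : heaps_(num_heaps) {
        assert(num_heaps > 0);
    }

    // Bulk insertion: item i goes to heap i % k. In a parallel version,
    // each heap could be filled by its own thread without locking.
    void bulk_push(const std::vector<int>& items) {
        for (std::size_t i = 0; i < items.size(); ++i)
            heaps_[i % heaps_.size()].push(items[i]);
    }

    // Bulk extraction: remove up to n smallest items, in ascending order.
    std::vector<int> bulk_pop(std::size_t n) {
        std::vector<int> out;
        while (out.size() < n) {
            MinHeap* h = smallest_heap();
            if (h == nullptr) break;  // queue is empty
            out.push_back(h->top());
            h->pop();
        }
        return out;
    }

    // Limit-style extraction (assumed semantics): remove every item
    // strictly smaller than `limit`, in ascending order.
    std::vector<int> pop_below(int limit) {
        std::vector<int> out;
        for (MinHeap* h = smallest_heap();
             h != nullptr && h->top() < limit;
             h = smallest_heap()) {
            out.push_back(h->top());
            h->pop();
        }
        return out;
    }
};
```

With three heaps, pushing the bulk `{7, 2, 9, 4, 1, 8}` and then calling `bulk_pop(3)` yields `1, 2, 4`; a following `pop_below(9)` yields `7, 8` and leaves `9` in the queue. Handing out whole batches like this is what lets insertions and extractions be decoupled, since the caller never needs to interleave single-item operations.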


Priority Queue, External Memory, Internal Memory, Small Item, Parallel Disk
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Karlsruhe Institute of Technology, Karlsruhe, Germany
