Cache-Oblivious Comparison-Based Algorithms on Multisets

  • Arash Farzan
  • Paolo Ferragina
  • Gianni Franceschini
  • J. Ian Munro
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3669)


We study three comparison-based problems on multisets in the cache-oblivious model: duplicate elimination, multisorting, and finding the most frequent element (the mode). Our goal is to minimize the cache complexity (the number of cache misses) of algorithms for these problems when the cache size and block size are unknown. We give algorithms whose cache complexities are within a constant factor of optimal for all three problems. For determining the mode, the optimal algorithm is randomized, since the deterministic algorithm differs from the lower bound by a sublogarithmic factor. Optimality can be achieved either with a randomized method or when the input is accompanied by lg lg of the relative frequency of the mode, up to a constant additive error.
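To make the three problem statements concrete, the following is a minimal Python sketch of what each one asks for, using only pairwise comparisons. It is a plain reference illustration and not the cache-oblivious algorithms of the paper: it makes no attempt to control cache misses. The function names (`compare`, `multisort`, `eliminate_duplicates`, `mode`) are hypothetical, and the sketch assumes that multisorting means sorting the multiset with equal elements kept adjacent.

```python
from functools import cmp_to_key


def compare(a, b):
    """Three-way comparison: negative, zero, or positive (the only operation used on elements)."""
    return (a > b) - (a < b)


def multisort(items):
    """Return the multiset in sorted order, duplicates included (assumed meaning of multisorting)."""
    return sorted(items, key=cmp_to_key(compare))


def eliminate_duplicates(items):
    """Return one representative of each distinct element, in sorted order."""
    distinct = []
    for x in multisort(items):
        # After sorting, equal elements are adjacent, so one comparison per element suffices.
        if not distinct or compare(distinct[-1], x) != 0:
            distinct.append(x)
    return distinct


def mode(items):
    """Return (most frequent element, its multiplicity); (None, 0) for empty input."""
    best, best_count = None, 0
    run_value, run_len = None, 0
    for x in multisort(items):
        if run_len and compare(run_value, x) == 0:
            run_len += 1
        else:
            run_value, run_len = x, 1
        if run_len > best_count:
            best, best_count = run_value, run_len
    return best, best_count
```

For example, on the multiset `[3, 1, 2, 3, 1, 3]` these functions return the sorted multiset, the distinct elements `1, 2, 3`, and the mode `3` with multiplicity 3. Sorting first is the simplest route to all three answers; the point of the paper is that duplicate elimination and mode finding can be done with fewer cache misses than this baseline suggests.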


Keywords (machine-generated, not supplied by the authors): Internal Node, Distinct Element, Cache Size, Cache Memory, Complete Binary Tree





Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Arash Farzan (1)
  • Paolo Ferragina (2)
  • Gianni Franceschini (2)
  • J. Ian Munro (1)

  1. School of Computer Science, University of Waterloo
  2. Department of Computer Science, University of Pisa
