Cache-Oblivious Dictionaries and Multimaps with Negligible Failure Probability

  • Michael T. Goodrich
  • Daniel S. Hirschberg
  • Michael Mitzenmacher
  • Justin Thaler
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7659)

Abstract

A dictionary (or map) is a key-value store that requires all keys be unique, and a multimap is a key-value store that allows for multiple values to be associated with the same key. We design hashing-based indexing schemes for dictionaries and multimaps that achieve worst-case optimal performance for lookups and updates, with minimal space overhead and sub-polynomial probability that the data structure will require a rehash operation. Our dictionary structure is designed for the Random Access Machine (RAM) model, while our multimap implementation is designed for the cache-oblivious external memory (I/O) model. The failure probabilities for our structures are sub-polynomial, which can be useful in cryptographic or data-intensive applications.
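
The distinction between the two abstract data types can be illustrated with a minimal Python sketch. It shows only the interface semantics described above (unique keys versus several values per key). It is not the hashing scheme of the paper, and the class and method names (`Dictionary`, `Multimap`, `put`, `get`, `get_all`) are hypothetical names chosen for illustration.

```python
from collections import defaultdict


class Dictionary:
    """Key-value store with unique keys: a later put overwrites the earlier value."""

    def __init__(self):
        self._table = {}

    def put(self, key, value):
        self._table[key] = value

    def get(self, key):
        # Returns None when the key is absent.
        return self._table.get(key)


class Multimap:
    """Key-value store in which a single key may be associated with several values."""

    def __init__(self):
        self._table = defaultdict(list)

    def put(self, key, value):
        self._table[key].append(value)

    def get_all(self, key):
        # Return a copy so callers cannot mutate internal state;
        # .get avoids creating an empty entry for an absent key.
        return list(self._table.get(key, []))


d = Dictionary()
d.put("cat", 1)
d.put("cat", 2)      # replaces the value stored under "cat"

m = Multimap()
m.put("cat", 1)
m.put("cat", 2)      # adds a second value under "cat"
```

Here the built-in Python hash table stands in for the underlying index. The structures in the paper additionally target worst-case bounds and a sub-polynomial rehash probability, which this sketch does not attempt to model.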

Keywords

Hash Function; Failure Probability; Hash Table; Hash Family; Membership Query
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Michael T. Goodrich (1)
  • Daniel S. Hirschberg (1)
  • Michael Mitzenmacher (2)
  • Justin Thaler (2)

  1. Dept. of Computer Science, University of California, Irvine, USA
  2. School of Engineering and Applied Sciences, Harvard University, USA
