On the k-Independence Required by Linear Probing and Minwise Independence

  • Mihai Pǎtraşcu
  • Mikkel Thorup
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6198)


We show that linear probing requires 5-independent hash functions for expected constant-time performance, matching an upper bound of [Pagh et al. STOC’07]. For (1 + ε)-approximate minwise independence, we show that \(\Omega(\lg \frac{1}{\varepsilon})\)-independent hash functions are required, matching an upper bound of [Indyk, SODA’99]. We also show that the multiply-shift scheme of Dietzfelbinger, most commonly used in practice, fails badly in both applications.


Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.


  1. 1.
    Alon, N., Nussboim, A.: k-wise independent random graphs. In: Proc. 49th IEEE Symposium on Foundations of Computer Science (FOCS), pp. 813–822 (2008)Google Scholar
  2. 2.
    Black, J.R., Martel, C.U., Qi, H.: Graph and hashing algorithms for modern architectures: Design and performance. In: Proc. 2nd International Workshop on Algorithm Engineering (WAE), pp. 37–48 (1998)Google Scholar
  3. 3.
    Broder, A.Z., Charikar, M., Frieze, A.M., Mitzenmacher, M.: Min-wise independent permutations. Journal of Computer and System Sciences 60(3), 630–659 (2000); See also STOC 1998Google Scholar
  4. 4.
    Broder, A.Z., Glassman, S.C., Manasse, M.S., Zweig, G.: Syntactic clustering of the web. Computer Networks 29, 1157–1166 (1997)Google Scholar
  5. 5.
    Cohen, E.: Size-estimation framework with applications to transitive closure and reachability. Journal of Computer and System Sciences 55(3), 441–453 (1997); See also STOC 1994Google Scholar
  6. 6.
    Dietzfelbinger, M.: Universal hashing and k-wise independent random variables via integer arithmetic without primes. In: Proc. 13th Symposium on Theoretical Aspects of Computer Science (STACS), pp. 569–580 (1996)Google Scholar
  7. 7.
    Heileman, G.L., Luo, W.: How caching affects hashing. In: Proc. 7th Workshop on Algorithm Engineering and Experiments (ALENEX), pp. 141–154 (2005)Google Scholar
  8. 8.
    Indyk, P.: A small approximately min-wise independent family of hash functions. Journal of Algorithms 38(1), 84–90 (2001); See also SODA 1999Google Scholar
  9. 9.
    Knuth, D.E.: Notes on open addressing (1963) (Unpublished memorandum), http://citeseer.ist.psu.edu/knuth63notes.html
  10. 10.
    Pagh, A., Pagh, R., Ružić, M.: Linear probing with constant independence. SIAM Journal on Computing 39(3), 1107–1120 (2009); See also STOC 2007Google Scholar
  11. 11.
    Pagh, R., Rodler, F.F.: Cuckoo hashing. Journal of Algorithms 51(2), 122–144 (2004); See also ESA 2001Google Scholar
  12. 12.
    Pǎtraşcu, M., Thorup, M.: The power of simple tabulation-based hashing (2010) (manuscript )Google Scholar
  13. 13.
    Schmidt, J.P., Siegel, A.: The analysis of closed hashing under limited randomness. In: Proc. 22nd ACM Symposium on Theory of Computing (STOC), pp. 224–234 (1990)Google Scholar
  14. 14.
    Schmidt, J.P., Siegel, A., Srinivasan, A.: Chernoff-Hoeffding bounds for applications with limited independence. SIAM Journal on Discrete Mathematics 8(2), 223–250 (1995); See also SODA 1993Google Scholar
  15. 15.
    Siegel, A., Schmidt, J.P.: Closed hashing is computable and optimally randomizable with universal hash functions. Technical Report TR1995-687, Currant Institute (1995)Google Scholar
  16. 16.
    Thorup, M.: Even strongly universal hashing is pretty fast. In: Proc. 11th ACM/SIAM Symposium on Discrete Algorithms (SODA), pp. 496–497 (2000)Google Scholar
  17. 17.
    Thorup, M., Zhang, Y.: Tabulation based 4-universal hashing with applications to second moment estimation. In: Proc. 15th ACM/SIAM Symposium on Discrete Algorithms (SODA), pp. 615–624 (2004)Google Scholar
  18. 18.
    Thorup, M., Zhang, Y.: Tabulation based 5-universal hashing and linear probing. In: Proc. 12th Workshop on Algorithm Engineering and Experiments, ALENEX (2009)Google Scholar
  19. 19.
    Wegman, M.N., Carter, L.: New classes and applications of hash functions. Journal of Computer and System Sciences 22(3), 265–279 (1981); see also FOCS 1979MATHCrossRefMathSciNetGoogle Scholar

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Mihai Pǎtraşcu
    • 1
  • Mikkel Thorup
    • 1
  1. 1.AT&T Labs 

Personalised recommendations