Cache-, Hash- and Space-Efficient Bloom Filters

  • Felix Putze
  • Peter Sanders
  • Johannes Singler
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4525)

Abstract

A Bloom filter is a very compact data structure that supports approximate membership queries on a set, allowing false positives.
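To make the definition concrete, here is a minimal sketch of a classic Bloom filter. It is illustrative only and not the tuned implementation the paper evaluates. The class name `BloomFilter`, the salted SHA-256 hashing scheme, and the helper `expected_fpr` are assumptions made for this example; the false positive estimate is the standard approximation (1 - e^(-kn/m))^k for n inserted elements, m bits, and k hash functions.

```python
import hashlib
import math


class BloomFilter:
    """Classic Bloom filter with m bits and k hash functions (illustrative sketch)."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        self.bits = bytearray((m + 7) // 8)

    def _positions(self, item: bytes):
        # Assumption: derive k bit positions by hashing the item with k salts.
        for i in range(self.k):
            digest = hashlib.sha256(i.to_bytes(4, "little") + item).digest()
            yield int.from_bytes(digest[:8], "little") % self.m

    def add(self, item: bytes) -> None:
        # Set all k bits belonging to the item.
        for p in self._positions(item):
            self.bits[p >> 3] |= 1 << (p & 7)

    def __contains__(self, item: bytes) -> bool:
        # True if all k bits are set: may be a false positive, never a false negative.
        return all(self.bits[p >> 3] & (1 << (p & 7)) for p in self._positions(item))


def expected_fpr(n: int, m: int, k: int) -> float:
    """Standard approximation of the false positive rate after n insertions."""
    return (1.0 - math.exp(-k * n / m)) ** k
```

Every inserted element is always reported as present; an element that was never inserted is reported as present only if all k of its bits happen to have been set by other elements.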

We propose several new variants of Bloom filters and replacements with similar functionality. All of them are more cache-efficient and need fewer hash bits than regular Bloom filters. Some use SIMD functionality, while the others provide even better space efficiency. As a consequence, we get a more flexible trade-off between false positive rate, space efficiency, cache efficiency, hash efficiency, and computational effort. We analyze the efficiency of Bloom filters and the proposed replacements in detail, in terms of the false positive rate, the expected number of cache misses, and the number of required hash bits. We also describe and experimentally evaluate the performance of highly tuned implementations. For many settings, our alternatives perform better than the methods proposed so far.
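One general way to trade a little accuracy for cache efficiency and fewer hash bits is to confine all k bits of an element to a single cache-line-sized block, so that a query touches only one block. The sketch below illustrates that general idea only; it is not taken from the paper and should not be read as the authors' implementation. The class name `BlockedBloomFilter`, the 512-bit block size, and the use of one SHA-256 digest as the source of hash bits are assumptions made for this example, and Python integers only model the blocks, they do not reproduce real cache behavior.

```python
import hashlib

BLOCK_BITS = 512  # assumption: one block models a 64-byte cache line


class BlockedBloomFilter:
    """Bloom filter variant where each element maps to exactly one block (sketch)."""

    def __init__(self, num_blocks: int, k: int) -> None:
        self.num_blocks = num_blocks
        self.k = k  # keep k small enough that one 256-bit digest supplies all bits
        self.blocks = [0] * num_blocks  # each int stands in for one 512-bit block

    def _block_and_mask(self, item: bytes):
        # One digest supplies the block index and all k in-block bit positions.
        h = int.from_bytes(hashlib.sha256(item).digest(), "little")
        block = h % self.num_blocks
        h //= self.num_blocks
        mask = 0
        for _ in range(self.k):
            mask |= 1 << (h % BLOCK_BITS)
            h //= BLOCK_BITS
        return block, mask

    def add(self, item: bytes) -> None:
        block, mask = self._block_and_mask(item)
        self.blocks[block] |= mask  # a single block is modified

    def might_contain(self, item: bytes) -> bool:
        block, mask = self._block_and_mask(item)
        return (self.blocks[block] & mask) == mask  # a single block is read
```

Because blocks fill unevenly, a filter of this shape generally needs somewhat more space than a classic Bloom filter to reach the same false positive rate, which is the kind of trade-off the abstract refers to.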

Copyright information

© Springer Berlin Heidelberg 2007

Authors and Affiliations

  • Felix Putze (1)
  • Peter Sanders (1)
  • Johannes Singler (1)
  1. Fakultät für Informatik, Universität Karlsruhe, Germany
