Algorithmica, Volume 73, Issue 4, pp 652–672

Tight Bounds for Sliding Bloom Filters

  • Moni Naor
  • Eylon Yogev

Abstract

A Bloom filter is a method for reducing the space (memory) required for representing a set by allowing a small error probability. In this paper we consider a Sliding Bloom Filter: a data structure that, given a stream of elements, supports membership queries on the set of the last n elements (a sliding window), while allowing a small error probability and a slackness parameter. The problem of sliding Bloom filters has appeared in the literature in several communities, but this work is the first theoretical investigation of it. We formally define the data structure and its relevant parameters and analyze the time and memory requirements needed to achieve them. We give a low-space construction that runs in \(O(1)\) time per update with high probability (that is, for all sequences with high probability all operations take constant time) and provide an almost matching lower bound on the space that shows that our construction has the best possible space consumption up to an additive lower-order term.
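
To make the interface described in the abstract concrete, the following is a minimal Python sketch of a naive baseline, not the construction analyzed in the paper. It assumes a simple two-generation scheme: two ordinary Bloom filters are rotated every n insertions, so the last n elements are always reported as present, while elements older than the last 2n are reported only with small probability (that is, the slack here is fixed at n). The class names, the SHA-256 double-hashing scheme, and the sizing formulas are illustrative assumptions and make no attempt to match the paper's time or space bounds.

```python
import hashlib
import math


class BloomFilter:
    """Plain Bloom filter; the bit array is stored in a Python int."""

    def __init__(self, capacity, error_rate):
        # Textbook sizing: bits = -n ln(eps) / (ln 2)^2, hashes = (bits / n) ln 2.
        self.num_bits = max(1, math.ceil(-capacity * math.log(error_rate) / math.log(2) ** 2))
        self.num_hashes = max(1, round(self.num_bits / capacity * math.log(2)))
        self.bits = 0

    def _positions(self, item):
        # Double hashing (an assumption of this sketch): derive all positions
        # from two 64-bit halves of a single SHA-256 digest.
        digest = hashlib.sha256(repr(item).encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item):
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


class TwoGenerationSlidingFilter:
    """Hypothetical baseline: window n, slack n, error roughly error_rate."""

    def __init__(self, window, error_rate):
        self.window = window
        # Split the error budget across the two filters (union bound).
        self.per_filter_error = error_rate / 2
        self.current = BloomFilter(window, self.per_filter_error)
        self.previous = BloomFilter(window, self.per_filter_error)
        self.current_count = 0

    def insert(self, item):
        # When the current generation is full, it becomes the previous one
        # and the oldest generation is discarded.
        if self.current_count == self.window:
            self.previous = self.current
            self.current = BloomFilter(self.window, self.per_filter_error)
            self.current_count = 0
        self.current.add(item)
        self.current_count += 1

    def query(self, item):
        # Together the two generations cover between n + 1 and 2n of the
        # most recent elements, so the last n are never missed.
        return item in self.current or item in self.previous


if __name__ == "__main__":
    f = TwoGenerationSlidingFilter(window=100, error_rate=0.01)
    for i in range(1000):
        f.insert(i)
    print(all(f.query(i) for i in range(900, 1000)))  # recent items are always found
```

In this sketch, elements that fall between n and 2n positions back may be answered either way, which is the role the slackness parameter plays in the abstract's definition. Rotating whole filters keeps each update cheap but uses two full Bloom filters, so it should be read only as an illustration of the query semantics.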

Keywords

Data structures, Bloom filter, Streaming algorithms, Lower bounds, Hash tables

Notes

Acknowledgments

We thank Ilan Komargodski, Tal Wagner, Ohad Shamir and the anonymous referees for many useful comments.

Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  1. Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot, Israel
