Tight Bounds for Sliding Bloom Filters

Algorithmica

Abstract

A Bloom filter is a method for reducing the space (memory) required to represent a set by allowing a small error probability. In this paper we consider a Sliding Bloom Filter: a data structure that, given a stream of elements, supports membership queries on the set of the last n elements (a sliding window), while allowing a small error probability and a slackness parameter. The problem of sliding Bloom filters has appeared in the literature of several communities, but this work is the first theoretical investigation of it. We formally define the data structure and its relevant parameters and analyze the time and memory requirements needed to achieve them. We give a low-space construction that runs in \(O(1)\) time per update with high probability (that is, for every sequence, with high probability all operations take constant time) and provide an almost matching lower bound on the space, which shows that our construction has the best possible space consumption up to an additive lower-order term.
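
To make the interface described in the abstract concrete, here is a minimal Python sketch of a naive sliding-window approximate membership structure. It is not the construction analyzed in the paper and makes no attempt to meet its space or time bounds. It is a simple baseline in which each cell stores the time of its most recent write. The class name, the parameters `window`, `num_cells` and `num_hashes`, and the use of salted BLAKE2b as a hash family are all assumptions made for illustration.

```python
import hashlib


class TimestampSlidingBloomFilter:
    """Naive baseline: each cell remembers the time of its last write.

    Hypothetical illustration only; not the paper's construction.
    """

    def __init__(self, window, num_cells, num_hashes):
        self.window = window          # n: number of most recent elements to remember
        self.num_cells = num_cells    # size of the cell array
        self.num_hashes = num_hashes  # cells touched per element (assumed >= 1)
        self.cells = [None] * num_cells
        self.clock = 0                # number of insertions so far

    def _positions(self, item):
        # Derive num_hashes cell indices from salted BLAKE2b digests.
        data = repr(item).encode("utf-8")
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(
                data, digest_size=8, salt=seed.to_bytes(8, "little")
            ).digest()
            yield int.from_bytes(digest, "little") % self.num_cells

    def insert(self, item):
        # Advance the stream position and stamp every cell of the item.
        self.clock += 1
        for pos in self._positions(item):
            self.cells[pos] = self.clock

    def query(self, item):
        # Report "present" only if every cell was written within the window.
        oldest = self.clock - self.window + 1
        return all(
            self.cells[pos] is not None and self.cells[pos] >= oldest
            for pos in self._positions(item)
        )
```

Because a cell's timestamp can only increase, any element among the last `window` insertions is always reported as present. Elements outside the window, or never inserted, may be reported as present by mistake when other recent elements have overwritten all of their cells. This sketch stores a full, unbounded timestamp per cell, so it is far from space-efficient.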

Notes

  1. The entropy lower bound is the base-2 logarithm of the size of the set of all possible inputs; in our case, the set of all possible pairs \((S,\pi )\).

  2. See the discussion in Sect. 2.1.

Acknowledgments

We thank Ilan Komargodski, Tal Wagner, Ohad Shamir and the anonymous referees for many useful comments.

Author information

Corresponding author

Correspondence to Eylon Yogev.

Additional information

Moni Naor: Incumbent of the Judith Kleeman Professorial Chair. Research supported in part by a grant from the Israel Science Foundation.

Research supported in part by a grant from the I-CORE Program of the Planning and Budgeting Committee, the Israel Science Foundation and the Citi Foundation. A preliminary version of this paper appeared in ISAAC 2013.

Cite this article

Naor, M., Yogev, E. Tight Bounds for Sliding Bloom Filters. Algorithmica 73, 652–672 (2015). https://doi.org/10.1007/s00453-015-0007-9

  • DOI: https://doi.org/10.1007/s00453-015-0007-9
