Abstract
A Bloom filter is a method for reducing the space (memory) required to represent a set by allowing a small error probability. In this paper we consider a Sliding Bloom Filter: a data structure that, given a stream of elements, supports membership queries on the set of the last n elements (a sliding window), while allowing a small error probability and a slackness parameter. The problem of sliding Bloom filters has appeared in the literature of several communities, but this work is the first theoretical investigation of it. We formally define the data structure and its relevant parameters, and analyze the time and memory needed to achieve them. We give a low-space construction that runs in \(O(1)\) time per update with high probability (that is, for all sequences, with high probability all operations take constant time), and we provide an almost matching lower bound on the space, showing that our construction has the best possible space consumption up to an additive lower-order term.
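To make the sliding-window membership problem concrete, here is a minimal Python sketch of a naive baseline. It is not the construction analyzed in this paper. Each cell stores the arrival index of the last element hashed to it, and a query answers "yes" only if all of an element's cells were touched within the last n insertions. The class name, the parameters `window`, `cells` and `hashes`, and the salted SHA-256 hashing are all assumptions made for illustration.

```python
import hashlib


class TimestampSlidingBloomFilter:
    """Naive sliding-window filter: each cell keeps the last arrival index that hit it."""

    def __init__(self, window, cells, hashes):
        self.window = window          # n: how many most recent elements to remember
        self.cells = cells            # number of timestamp cells (hypothetical parameter)
        self.hashes = hashes          # k: cells probed per element (hypothetical parameter)
        self.stamps = [None] * cells  # None means the cell was never touched
        self.time = 0                 # number of elements inserted so far

    def _positions(self, item):
        # Derive k cell indices from salted SHA-256 digests (an arbitrary choice).
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.cells

    def insert(self, item):
        # Advance the clock and stamp every cell of the item with the new time.
        self.time += 1
        for pos in self._positions(item):
            self.stamps[pos] = self.time

    def query(self, item):
        # Answer "yes" only if every cell was stamped within the last `window` insertions.
        oldest_allowed = self.time - self.window + 1
        return all(
            self.stamps[pos] is not None and self.stamps[pos] >= oldest_allowed
            for pos in self._positions(item)
        )


if __name__ == "__main__":
    f = TimestampSlidingBloomFilter(window=100, cells=2000, hashes=4)
    for i in range(1000):
        f.insert(i)
    # Every one of the last 100 elements is reported as present.
    print(all(f.query(i) for i in range(900, 1000)))
```

An element inserted within the window can only have its cells overwritten by later timestamps, so this sketch never produces a false negative for the last n elements. Older elements may still yield false positives. The sketch stores a full timestamp in every cell and uses no slack, so it illustrates only the query semantics, not the space bounds discussed here.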
Notes
The entropy lower bound is the base-2 logarithm of the size of the set of all possible inputs; in our case, the set of all possible pairs \((S,\pi )\).
See the discussion in Sect. 2.1.
Acknowledgments
We thank Ilan Komargodski, Tal Wagner, Ohad Shamir and the anonymous referees for many useful comments.
Additional information
Moni Naor: Incumbent of the Judith Kleeman Professorial Chair. Research supported in part by a Grant from the Israel Science Foundation.
Research supported in part by a Grant from the I-CORE Program of the Planning and Budgeting Committee, the Israel Science Foundation and the Citi Foundation. A preliminary version of this paper appeared in ISAAC 2013.
Cite this article
Naor, M., Yogev, E. Tight Bounds for Sliding Bloom Filters. Algorithmica 73, 652–672 (2015). https://doi.org/10.1007/s00453-015-0007-9