Sliding Bloom Filters

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8283)

Abstract

A Bloom filter is a method for reducing the space (memory) required for representing a set by allowing a small error probability. In this paper we consider a Sliding Bloom Filter: a data structure that, given a stream of elements, supports membership queries of the set of the last n elements (a sliding window), while allowing a small error probability and a slackness parameter. The problem of sliding Bloom filters has appeared in the literature in several communities, but this work is the first theoretical investigation of it.
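
To make the interface concrete, here is a minimal Python sketch of a naive baseline with the same query semantics: a rotating chain of ordinary Bloom filters. It is not the construction analyzed in the paper, and it makes no attempt at the space or time bounds claimed there. The names `BloomFilter`, `NaiveSlidingBloomFilter`, `n` (window size), `m` (slack) and `eps` (error probability) are assumptions made for illustration, as is the simplifying requirement that `m` divides `n`. Under those assumptions, an element among the last n inserted is always reported present, while an element last seen more than n + m insertions ago is reported present only as a false positive, with probability roughly at most `eps` by a union bound over the filters.

```python
import hashlib
import math
from collections import deque


class BloomFilter:
    """Plain Bloom filter over a fixed-size bit array stored in a bytearray."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)
        self.count = 0  # number of add() calls; used only to decide rotation

    def _positions(self, item):
        # Double hashing: derive all positions from one SHA-256 digest.
        # This is a simplification, not a set of independent hash functions.
        digest = hashlib.sha256(repr(item).encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)
        self.count += 1

    def __contains__(self, item):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))


class NaiveSlidingBloomFilter:
    """Hypothetical baseline: n // m + 1 Bloom filters, each covering m stream items."""

    def __init__(self, n, m, eps):
        if m <= 0 or n % m != 0:
            raise ValueError("this sketch assumes the slack m divides the window n")
        self.m = m
        self.max_filters = n // m + 1
        # Split the error budget so a union bound over all filters gives about eps.
        per_filter_eps = eps / self.max_filters
        # Standard Bloom filter sizing for m items at the per-filter error rate.
        self.num_bits = max(8, math.ceil(-m * math.log(per_filter_eps) / math.log(2) ** 2))
        self.num_hashes = max(1, round(self.num_bits / m * math.log(2)))
        self.filters = deque([BloomFilter(self.num_bits, self.num_hashes)])

    def insert(self, item):
        # Rotate lazily: open a new filter once the newest one holds m items,
        # and discard the oldest filter when the chain is too long.
        if self.filters[-1].count == self.m:
            self.filters.append(BloomFilter(self.num_bits, self.num_hashes))
            if len(self.filters) > self.max_filters:
                self.filters.popleft()
        self.filters[-1].add(item)

    def query(self, item):
        # Present if any live filter reports it; Bloom filters have no false negatives.
        return any(item in f for f in self.filters)


if __name__ == "__main__":
    sbf = NaiveSlidingBloomFilter(n=20, m=5, eps=0.01)
    for x in range(100):
        sbf.insert(x)
    print(all(sbf.query(x) for x in range(80, 100)))  # the last 20 items
```

After any insertion the live filters together cover between n + 1 and n + m of the most recent stream positions, which is where the slack comes from. Splitting `eps` across n/m + 1 filters costs extra bits per element compared with a single filter, which illustrates why a dedicated construction is of interest.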

We formally define the data structure and its relevant parameters, and analyze the time and memory requirements needed to achieve them. We give a low-space construction that runs in O(1) time per update with high probability (that is, with high probability, for every sequence all operations take constant time), and we provide an almost matching lower bound on the space, which shows that our construction has the best possible space consumption up to an additive lower-order term.

Research supported in part by a grant from the I-CORE Program of the Planning and Budgeting Committee, the Israel Science Foundation and the Citi Foundation.

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Naor, M., Yogev, E. (2013). Sliding Bloom Filters. In: Cai, L., Cheng, S.-W., Lam, T.-W. (eds) Algorithms and Computation. ISAAC 2013. Lecture Notes in Computer Science, vol 8283. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-45030-3_48

  • DOI: https://doi.org/10.1007/978-3-642-45030-3_48

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-45029-7

  • Online ISBN: 978-3-642-45030-3

  • eBook Packages: Computer Science, Computer Science (R0)
