The Sliding-Window Computation Model and Results

  • Chapter
Data Stream Management

Part of the book series: Data-Centric Systems and Applications (DCSA)

Abstract

We present results on small-space computation over sliding windows in the data-stream model. Most research in the data-stream model, including results presented in some of the other chapters, assumes that all data elements seen so far in the stream are equally important and that the synopses, statistics, or models being built should reflect the entire data set. For many applications, however, this assumption does not hold, particularly for those that ascribe more importance to recent data items. One way to discount old data items and consider only recent ones for analysis is the sliding-window model: data elements arrive at every time instant; each data element expires after exactly N time steps; and the portion of the data that is relevant to gathering statistics or answering queries is the set of the last N elements to arrive. The sliding window refers to the window of active data elements at a given time instant, and the window size refers to N. This chapter presents a general technique, called the Exponential Histogram (EH) technique, that can be used to solve a wide variety of problems in the sliding-window model, typically problems that require us to maintain statistics. We showcase this technique through solutions to basic counting problems, as well as other applications.
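The abstract names the EH technique without spelling it out, so the following is only a minimal Python sketch of how an exponential histogram might handle basic counting (estimating the number of 1s among the last N bits). The class name `ExponentialHistogram`, the parameters `window` and `eps`, and the exact merge thresholds are assumptions made for illustration; the chapter's own constants and presentation may differ. The idea is to keep buckets whose sizes are powers of two, merge the two oldest buckets of a size when too many accumulate, and treat the oldest bucket as half inside the window.

```python
from math import ceil


class ExponentialHistogram:
    """Approximate count of 1s among the last `window` bits of a stream.

    Illustrative sketch only: names and thresholds are assumptions,
    not taken from the chapter.
    """

    def __init__(self, window: int, eps: float) -> None:
        self.window = window          # N, the sliding-window size
        self.k = ceil(1 / eps)        # controls buckets kept per size
        self.half = ceil(self.k / 2)
        self.time = 0                 # arrival index of the latest element
        # Each bucket is (timestamp of its newest 1, number of 1s it holds),
        # stored oldest first; sizes are powers of two.
        self.buckets = []

    def add(self, bit: int) -> None:
        self.time += 1
        # Drop the oldest bucket once its newest 1 has left the window.
        while self.buckets and self.buckets[0][0] <= self.time - self.window:
            self.buckets.pop(0)
        if not bit:
            return
        self.buckets.append((self.time, 1))
        size = 1
        while True:
            same = [i for i, (_, s) in enumerate(self.buckets) if s == size]
            # Assumed thresholds: k + 2 buckets of size 1, half + 2 otherwise.
            limit = self.k + 2 if size == 1 else self.half + 2
            if len(same) < limit:
                break
            # Merge the two oldest buckets of this size (they are adjacent),
            # keeping the more recent timestamp; the merge may cascade.
            first, second = same[0], same[1]
            merged = (self.buckets[second][0], 2 * size)
            self.buckets[first:second + 1] = [merged]
            size *= 2

    def estimate(self) -> int:
        if not self.buckets:
            return 0
        total = sum(s for _, s in self.buckets)
        # Only the oldest bucket can straddle the window boundary,
        # so count half of it (a size-1 bucket is counted in full).
        return total - self.buckets[0][1] // 2


if __name__ == "__main__":
    eh = ExponentialHistogram(window=1000, eps=0.1)
    for i in range(5000):
        eh.add(1 if i % 3 else 0)
    print(eh.estimate(), len(eh.buckets))
```

Under these assumed thresholds the estimate should stay within a relative error of `eps` of the true count, while the number of buckets grows only logarithmically with the window size for a fixed `eps`, compared with storing all N bits for an exact answer.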

Author information

Corresponding author

Correspondence to Mayur Datar.

Copyright information

© 2016 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Datar, M., Motwani, R. (2016). The Sliding-Window Computation Model and Results. In: Garofalakis, M., Gehrke, J., Rastogi, R. (eds) Data Stream Management. Data-Centric Systems and Applications. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-28608-0_7

  • DOI: https://doi.org/10.1007/978-3-540-28608-0_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-28607-3

  • Online ISBN: 978-3-540-28608-0

  • eBook Packages: Computer Science, Computer Science (R0)
