
Data Streams, pp. 149-167

The Sliding-Window Computation Model and Results

  • Mayur Datar
  • Rajeev Motwani
Part of the Advances in Database Systems book series (ADBS, volume 31)

Abstract

The sliding-window model of computation is motivated by the assumption that, in certain data-stream processing applications, recent data is more useful and pertinent than older data. In such cases, we would like to answer questions about the data over only the N most recent data elements (N is a parameter). We formalize this model of computation and determine how much space and computation time are required to solve certain problems under the sliding-window model.
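
To make the model concrete, here is a minimal sketch of a naive exact baseline, assuming a stream of numbers and a sum query. The class name `SlidingWindowSum` and its methods are hypothetical and do not come from the chapter. The sketch stores all N window elements, so it uses space linear in N; it is only a reference point for the space question raised above, not one of the chapter's algorithms.

```python
from collections import deque


class SlidingWindowSum:
    """Exact sum over the N most recent stream elements (naive baseline).

    Hypothetical illustration: keeps every element of the window,
    so memory grows linearly with N.
    """

    def __init__(self, n):
        if n <= 0:
            raise ValueError("window size N must be positive")
        self.n = n
        self.window = deque()  # holds at most N most recent elements
        self.total = 0         # running sum of the elements in the window

    def add(self, x):
        """Consume one stream element, expiring the oldest if needed."""
        self.window.append(x)
        self.total += x
        if len(self.window) > self.n:
            # The element that just fell out of the window no longer counts.
            self.total -= self.window.popleft()

    def query(self):
        """Return the sum over the current window."""
        return self.total


w = SlidingWindowSum(3)
for bit in [1, 0, 1, 1, 0]:
    w.add(bit)
print(w.query())  # sum over the last three elements: 1, 1, 0
```

Each update and query takes constant time here; the cost is the stored window itself.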

Keywords

sliding-window, exponential histograms, space lower bounds

Copyright information

© Springer Science+Business Media, LLC 2007

Authors and Affiliations

  • Mayur Datar (1)
  • Rajeev Motwani (2)
  1. Google, Inc., USA
  2. Department of Computer Science, Stanford University, USA
