Data Streams, pp. 127–147

Load Shedding in Data Stream Systems

  • Brian Babcock
  • Mayur Datar
  • Rajeev Motwani
Part of the Advances in Database Systems book series (ADBS, volume 31)

Abstract

Systems for processing continuous monitoring queries over data streams must be adaptive because data streams are often bursty and data characteristics may vary over time. In this chapter, we focus on one particular type of adaptivity: the ability to gracefully degrade performance via “load shedding” (dropping unprocessed tuples to reduce system load) when the demands placed on the system cannot be met in full given available resources. Focusing on aggregation queries, we present algorithms that determine at which points in a query plan load shedding should be performed, and how much load should be shed at each point, in order to minimize the inaccuracy introduced into query answers. We also discuss load shedding strategies for other types of queries (set-valued queries, join queries, and classification queries).
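
As a rough illustration of the idea of dropping unprocessed tuples, the following is a minimal sketch of random load shedding in front of a single SUM aggregate. It is not the chapter's algorithm: it uses one shedding point rather than choosing points across a query plan, and the names `shed_and_sum` and `keep_prob` are hypothetical. Each tuple is kept with probability `keep_prob`, and the partial sum is scaled by `1 / keep_prob` so that the estimate remains unbiased.

```python
import random


def shed_and_sum(values, keep_prob, rng):
    """Estimate sum(values) while processing only a random fraction of tuples.

    Assumption: a single shedding point placed before a SUM aggregate.
    Returns (estimated_sum, number_of_tuples_actually_processed).
    """
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    kept_total = 0.0
    processed = 0
    for v in values:
        # Load shedding step: drop the tuple before doing any further work.
        if rng.random() >= keep_prob:
            continue
        kept_total += v
        processed += 1
    # Scale up to compensate for the dropped tuples.
    return kept_total / keep_prob, processed
```

Lowering `keep_prob` reduces the number of tuples processed at the cost of a higher-variance answer, which is the trade-off between system load and query accuracy that the abstract describes.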

Keywords

data streams; load shedding; adaptive query processing; sliding windows; autonomic computing

Copyright information

© Springer Science+Business Media, LLC 2007

Authors and Affiliations

  • Brian Babcock (1)
  • Mayur Datar (2)
  • Rajeev Motwani (1)
  1. Department of Computer Science, Stanford University, USA
  2. Google, Inc., USA
