Theory of Computing Systems, Volume 37, Issue 3, pp 457–478

Distributed Streams Algorithms for Sliding Windows

  • Phillip B. Gibbons
  • Srikanta Tirthapura
Abstract

Massive data sets often arise as physically distributed, parallel data streams, and it is important to estimate various aggregates and statistics on the union of these streams. This paper presents algorithms for estimating aggregate functions over a “sliding window” of the N most recent data items in one or more streams. Our results include:

  1. For a single stream, we present the first ε-approximation scheme for the number of 1’s in a sliding window that is optimal in both worst-case time and space. We also present the first ε-approximation scheme for the sum of integers in [0..R] in a sliding window that is optimal in both worst-case time and space (assuming R is at most polynomial in N). Both algorithms are deterministic and use only logarithmic memory words.
  2. In contrast, we show that any deterministic algorithm that estimates, to within a small constant relative error, the number of 1’s (or the sum of integers) in a sliding window on the union of distributed streams requires Ω(N) space.
  3. We present the first (randomized) (ε, δ)-approximation scheme for the number of 1’s in a sliding window on the union of distributed streams that uses only logarithmic memory words. We also present the first (ε, δ)-approximation scheme for the number of distinct values in a sliding window on distributed streams that uses only logarithmic memory words.

Our results are obtained using a novel family of synopsis data structures called waves.
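To make the single-stream problem concrete, the following is a minimal Python sketch of ε-approximate counting of 1's over the last N bits. It is not the wave synopsis described in the abstract, whose details are not given here. Instead it uses an exponential-histogram style bucketing as an illustrative stand-in, and it makes no claim to the worst-case time optimality stated above. The names `ExpHistogram`, `max_per_size`, and `max_relative_error` are hypothetical, chosen only for this example.

```python
import math
import random


class ExpHistogram:
    """Approximate count of 1's among the last `window` bits.

    Illustrative stand-in only: this is not the paper's wave synopsis.
    """

    def __init__(self, window, eps):
        self.window = window
        # Assumption of this sketch: allowing about 1/eps buckets per
        # size keeps the relative error of estimate() within eps.
        self.max_per_size = max(2, math.ceil(1 / eps))
        # Oldest bucket first; each is [time of its newest 1, number of 1's].
        self.buckets = []
        self.now = 0

    def add(self, bit):
        self.now += 1
        # Drop the oldest bucket once its newest 1 has left the window.
        if self.buckets and self.buckets[0][0] <= self.now - self.window:
            self.buckets.pop(0)
        if bit != 1:
            return
        self.buckets.append([self.now, 1])
        size = 1
        while True:
            same = [i for i, b in enumerate(self.buckets) if b[1] == size]
            if len(same) <= self.max_per_size:
                break
            # Merge the two oldest buckets of this size into one of
            # double size, keeping the newer of the two timestamps.
            older, newer = same[0], same[1]
            self.buckets[newer][1] = 2 * size
            del self.buckets[older]
            size *= 2

    def estimate(self):
        if not self.buckets:
            return 0
        total = sum(size for _, size in self.buckets)
        # Only the oldest bucket can straddle the window boundary,
        # so count half of it (rounded up).
        return total - self.buckets[0][1] // 2


def max_relative_error(window, eps, steps, seed):
    """Compare the sketch with an exact O(window) count on random bits."""
    rng = random.Random(seed)
    hist = ExpHistogram(window, eps)
    bits, worst = [], 0.0
    for _ in range(steps):
        bit = rng.randint(0, 1)
        bits.append(bit)
        hist.add(bit)
        exact = sum(bits[-window:])
        if exact:
            worst = max(worst, abs(hist.estimate() - exact) / exact)
    return worst
```

Calling `max_relative_error(50, 0.25, 2000, 0)` runs the sketch against an exact count on a random bit stream and returns the largest relative error observed. The sketch stores a number of buckets that grows roughly with (1/ε) log N, whereas the exact baseline inside the helper keeps every bit.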

Keywords

Data Stream, Approximation Scheme, Query Time, Deterministic Algorithm, Single Stream

Copyright information

© Springer-Verlag 2004

Authors and Affiliations

  1. Intel Research Pittsburgh, 417 South Craig Street, Pittsburgh, PA 15213, USA
  2. Department of Electrical and Computer Engineering, Iowa State University, Ames, IA 50010, USA
