Sublinear Methods for Detecting Periodic Trends in Data Streams

  • Funda Ergun
  • S. Muthukrishnan
  • S. Cenk Sahinalp
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2976)


We present sublinear algorithms (algorithms that use significantly fewer resources than needed to store or process the entire input stream) for discovering representative trends in data streams in the form of periodicities. Our algorithms involve sampling \(\tilde{O}(\sqrt{n})\) positions, and thus they scan not the entire data stream but merely a sublinear sample thereof. Alternatively, our algorithms may be thought of as working on streaming inputs where each data item is seen once, but we store only a sublinear sample, of size \(\tilde{O}(\sqrt{n})\), from which we can identify periodicities. In this work we present a variety of definitions of periodicities of a given stream, present sublinear sampling algorithms for discovering them, and prove that the algorithms meet our specifications and guarantees. No previously known results can provide such guarantees for finding any such periodic trends. We also investigate the relationships between these different definitions of periodicity.
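To make the sampling idea concrete, here is a minimal sketch. It is not the algorithm from the paper, whose details this abstract does not give. It assumes random access to a stored sequence `x` and a single candidate period `p`, and it estimates how often `x[i]` differs from `x[i + p]` by checking about \(\sqrt{n}\) random positions (the logarithmic factors hidden in \(\tilde{O}\) are omitted). The function name `sampled_period_mismatch` and its parameters are hypothetical.

```python
import math
import random


def sampled_period_mismatch(x, p, rng=None):
    """Estimate the fraction of positions i with x[i] != x[i + p].

    Illustrative sketch only: it samples about sqrt(n) positions uniformly
    at random instead of scanning all n - p pairs. This is an assumption
    made for illustration, not the paper's method.
    """
    n = len(x)
    if not 0 < p < n:
        raise ValueError("period must satisfy 0 < p < len(x)")
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is reproducible

    k = max(1, math.isqrt(n))  # sample size ~ sqrt(n), log factors omitted
    mismatches = 0
    for _ in range(k):
        i = rng.randrange(n - p)  # uniform position with i + p in range
        if x[i] != x[i + p]:
            mismatches += 1
    return mismatches / k


if __name__ == "__main__":
    stream = [0, 1, 2] * 100  # exactly periodic with period 3
    print(sampled_period_mismatch(stream, 3))
    print(sampled_period_mismatch(stream, 1))
```

An estimate near 0 suggests that `p` is consistent with the sampled positions, while a large estimate suggests it is not. A sketch like this only tests one given candidate and offers no formal guarantee, and it does not search for an unknown period.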


Data Stream, Time Series Data, Block Pair, Periodic Trend, Secondary Sample
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Funda Ergun (1)
  • S. Muthukrishnan (2)
  • S. Cenk Sahinalp (1)
  1. Department of EECS, Case Western Reserve University
  2. Department of Computer Science, Rutgers University
