Robustness of Change Detection Algorithms

  • Tamraparni Dasu
  • Shankar Krishnan
  • Gina Maria Pomann
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7014)

Abstract

Stream mining is a challenging problem that has attracted considerable attention in the last decade. As a result, there are numerous algorithms for mining data streams, ranging from summarization and analysis to change and anomaly detection. However, most research focuses on proposing, adapting or improving algorithms and studying their computational performance. For a practitioner of stream mining, there is very little guidance on choosing a technology suited for a particular task or application.

In this paper, we address the practical aspect of choosing a suitable algorithm by drawing on the statistical properties of power and robustness. For the purpose of illustration, we focus on change detection algorithms (CDAs). We define an objective performance measure, streaming power, and use it to explore the robustness of three different algorithms. The measure is comparable across disparate algorithms, and provides a common framework for comparing and evaluating change detection algorithms on any data set in a meaningful fashion. We demonstrate the approach on real-world applications and on synthetic data.

In addition, we present a repository of data streams for the community to test change detection algorithms for streaming data.
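To make the idea of a power curve for a change detector concrete, here is a minimal sketch. The abstract does not define streaming power, so this is not the authors' measure: it estimates classical statistical power by simulation, using a simple two-window Kolmogorov-Smirnov detector on synthetic Gaussian data as a stand-in. All function names, the window size, and the mean-shift model are hypothetical choices made for illustration.

```python
import bisect
import math
import random


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    d = 0.0
    for x in a_sorted + b_sorted:
        fa = bisect.bisect_right(a_sorted, x) / len(a_sorted)
        fb = bisect.bisect_right(b_sorted, x) / len(b_sorted)
        d = max(d, abs(fa - fb))
    return d


def ks_critical_value(n, m, alpha=0.05):
    """Large-sample approximation to the KS rejection threshold."""
    c = math.sqrt(-0.5 * math.log(alpha / 2.0))
    return c * math.sqrt((n + m) / (n * m))


def empirical_power(shift, window=200, trials=200, alpha=0.05, seed=0):
    """Fraction of simulated window pairs in which a change is flagged.

    Assumption: the reference window is N(0, 1) and the current window
    is N(shift, 1); a real stream would supply both windows instead.
    """
    rng = random.Random(seed)
    threshold = ks_critical_value(window, window, alpha)
    hits = 0
    for _ in range(trials):
        reference = [rng.gauss(0.0, 1.0) for _ in range(window)]
        current = [rng.gauss(shift, 1.0) for _ in range(window)]
        if ks_statistic(reference, current) > threshold:
            hits += 1
    return hits / trials


def power_curve(shifts, **kwargs):
    """Empirical power at each change magnitude in `shifts`."""
    return [(s, empirical_power(s, **kwargs)) for s in shifts]


if __name__ == "__main__":
    for shift, power in power_curve([0.0, 0.1, 0.25, 0.5, 1.0]):
        print(f"shift={shift:.2f}  power={power:.2f}")
```

At a shift of zero the reported value is the false alarm rate rather than power, which is why it is useful to include that point on the curve. Comparing curves of this kind for different detectors on the same data is one simple way to put disparate algorithms on a common scale.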

Keywords

Data Stream; Change Point; Reference Distribution; Power Curve; Kolmogorov Smirnov
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Tamraparni Dasu (1)
  • Shankar Krishnan (1)
  • Gina Maria Pomann (2)
  1. AT&T Labs - Research, USA
  2. North Carolina State University, USA
