Encyclopedia of Database Systems

2018 Edition
Editors: Ling Liu, M. Tamer Özsu

Stream Mining

  • Jiawei Han
  • Bolin Ding
Reference work entry
DOI: https://doi.org/10.1007/978-1-4614-8265-9_369

Synonyms

Stream data analysis

Definition

Stream mining is the process of discovering knowledge or patterns from continuous data streams. Unlike traditional data sets, data streams consist of sequences of data instances that flow in and out of a system continuously and at varying update rates. They are temporally ordered, fast changing, massive, and potentially infinite. Examples include data generated by communication networks, Internet traffic, online stock or business transactions, electric power grids, industrial production processes, and scientific and engineering experiments, as well as video, audio, or remote sensing data from cameras, satellites, and sensor networks. Because the tremendous volume of a data stream usually makes it impossible to store the stream in full or to scan it multiple times, most stream mining algorithms are restricted to reading the data once, or a small number of times, using limited computing and storage resources. Moreover, much of stream data resides at...
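
As a minimal illustration of the one-pass, limited-memory constraint described above, the sketch below implements the Misra-Gries frequent-items summary in Python. This technique is not taken from this entry; it is chosen only as a well-known example of a single-scan algorithm, and the function name `misra_gries` and the parameter `k` are illustrative assumptions.

```python
from typing import Dict, Hashable, Iterable


def misra_gries(stream: Iterable[Hashable], k: int) -> Dict[Hashable, int]:
    """One-pass frequent-item summary that keeps at most k - 1 counters.

    For a stream of length n, every item that occurs more than n / k times
    is guaranteed to remain in the returned summary. Stored counts are lower
    bounds on the true frequencies, too small by at most n / k.
    """
    if k < 2:
        raise ValueError("k must be at least 2")

    counters: Dict[Hashable, int] = {}
    for item in stream:  # each element is read exactly once
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement every counter and drop those
            # that reach zero, so memory never exceeds k - 1 entries.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters
```

For instance, `misra_gries("abacadaeaf", k=3)` scans the ten characters once, never holds more than two counters, and retains `"a"`, which occurs five times (more than 10 / 3). The memory bound depends only on `k`, not on the stream length, which is what makes this style of summary usable on potentially infinite streams.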

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. University of Illinois at Urbana-Champaign, Urbana, USA

Section editors and affiliations

  • Divesh Srivastava (1)
  1. AT&T Labs - Research, AT&T, Bedminster, USA