
An Introduction to Data Streams

Chapter in: Data Streams

Part of the book series: Advances in Database Systems (ADBS, volume 31)

Abstract

In recent years, advances in hardware technology have made it possible to collect data continuously. In many applications, such as network monitoring, the volume of such data is so large that it may be impossible to store it on disk. Furthermore, even when the data can be stored, the incoming volume may be so large that no record can be processed more than once. As a result, many data mining and database operations, such as classification, clustering, frequent pattern mining, and indexing, become significantly more challenging in this context.
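To make the one-pass constraint concrete, the following is a minimal sketch, not taken from the chapter, of a summary statistic computed while touching each record exactly once and using constant memory. It uses Welford's running mean and variance update; the function name `one_pass_stats` and the choice of population variance are assumptions made for illustration.

```python
from typing import Iterable, Tuple


def one_pass_stats(stream: Iterable[float]) -> Tuple[int, float, float]:
    """Return (count, mean, population variance) after a single pass.

    Illustrative only: each record is read once and then discarded,
    so memory use does not grow with the length of the stream.
    """
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in stream:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)  # uses the already-updated mean
    variance = m2 / count if count else 0.0
    return count, mean, variance


if __name__ == "__main__":
    # A generator stands in for a stream that cannot be replayed.
    readings = (float(v) for v in [2, 4, 4, 4, 5, 5, 7, 9])
    print(one_pass_stats(readings))
```

The same pattern, a small state that is updated per record, underlies many of the stream synopses used for clustering, classification, and frequent pattern mining, although those structures are considerably more involved.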

In many cases, the data patterns may evolve continuously, so the mining algorithms must be designed to account for changes in the underlying structure of the data stream. This makes the underlying problems even more difficult from an algorithmic and computational point of view. This book contains a number of chapters, carefully chosen to discuss the broad research issues in data streams. The purpose of this chapter is to provide an overview of the organization of the stream processing and mining techniques covered in this book.
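One simple way to account for evolving patterns, sketched here as an assumption rather than as a method from the book, is to weight recent records more heavily than old ones. The hypothetical `DecayedMean` class below keeps an exponentially decayed average; the parameter `alpha` (an illustrative name) controls how quickly older records are forgotten.

```python
from typing import Optional


class DecayedMean:
    """Exponentially decayed running mean over a stream (illustrative)."""

    def __init__(self, alpha: float = 0.1) -> None:
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self.value: Optional[float] = None  # no records seen yet

    def update(self, x: float) -> float:
        """Fold one record into the estimate and return the new value."""
        if self.value is None:
            self.value = x  # first record initializes the estimate
        else:
            # Move the estimate a fraction alpha toward the new record.
            self.value += self.alpha * (x - self.value)
        return self.value


if __name__ == "__main__":
    tracker = DecayedMean(alpha=0.5)
    # The stream shifts from values near 0 to values near 10.
    for x in [0.0, 0.0, 10.0, 10.0]:
        print(tracker.update(x))
```

A larger `alpha` reacts faster to a change in the stream but is noisier; a smaller one is smoother but slower to follow the shift.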

Copyright information

© 2007 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Aggarwal, C.C. (2007). An Introduction to Data Streams. In: Aggarwal, C.C. (ed.) Data Streams. Advances in Database Systems, vol 31. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-47534-9_1

  • DOI: https://doi.org/10.1007/978-0-387-47534-9_1

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-387-28759-1

  • Online ISBN: 978-0-387-47534-9

  • eBook Packages: Computer Science, Computer Science (R0)
