Abstract
In recent years, advances in hardware technology have made it possible to collect data continuously. In many applications, such as network monitoring, the volume of such data is so large that it may be impossible to store it on disk. Furthermore, even when the data can be stored, the volume of incoming data may be so large that it is impossible to process any particular record more than once. Therefore, many data mining and database operations, such as classification, clustering, frequent pattern mining, and indexing, become significantly more challenging in this context.
In many cases, the data patterns may evolve continuously, so mining algorithms must be designed to account for changes in the underlying structure of the data stream. This makes the underlying problems even more difficult from an algorithmic and computational point of view. This book contains a number of chapters, carefully chosen to discuss the broad research issues in data streams. The purpose of this chapter is to provide an overview of the organization of the stream processing and mining techniques covered in this book.
Copyright information
© 2007 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Aggarwal, C.C. (2007). An Introduction to Data Streams. In: Aggarwal, C.C. (eds) Data Streams. Advances in Database Systems, vol 31. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-47534-9_1
DOI: https://doi.org/10.1007/978-0-387-47534-9_1
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-28759-1
Online ISBN: 978-0-387-47534-9
eBook Packages: Computer Science, Computer Science (R0)