ADMA 2011: Advanced Data Mining and Applications pp 27-40
HUE-Stream: Evolution-Based Clustering Technique for Heterogeneous Data Streams with Uncertainty
Abstract
Evolution-based stream clustering methods support the monitoring and change detection of clustering structures. E-Stream is an evolution-based stream clustering method that supports five types of clustering structure evolution: appearance, disappearance, self-evolution, merge, and split. This paper presents HUE-Stream, which extends E-Stream to support uncertainty in heterogeneous data. A distance function, a cluster representation, and histogram management are introduced for the different types of clustering structure evolution. We evaluate the effectiveness of HUE-Stream on a real-world dataset, the KDD Cup 1999 Network Intrusion Detection data. Experimental results show that HUE-Stream gives better cluster quality than UMicro.
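The abstract does not give the distance function itself. As a rough illustration of what a distance over uncertain heterogeneous records could look like, the sketch below combines a range-normalized difference for numeric attributes with an expected mismatch probability for categorical attributes. This is not the formulation used in HUE-Stream: the `UncertainPoint` class, the function names, and the attribute-count weighting are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class UncertainPoint:
    # Hypothetical representation, not taken from the paper.
    numeric: List[float]                 # expected value of each numeric attribute
    categorical: List[Dict[str, float]]  # per attribute: {value: probability}


def numeric_distance(a: List[float], b: List[float], ranges: List[float]) -> float:
    """Mean absolute difference, each attribute scaled by its value range."""
    if not a:
        return 0.0
    return sum(abs(x - y) / r for x, y, r in zip(a, b, ranges)) / len(a)


def categorical_distance(p: List[Dict[str, float]], q: List[Dict[str, float]]) -> float:
    """Mean probability that two attributes, drawn independently, take different values."""
    if not p:
        return 0.0
    total = 0.0
    for dist_p, dist_q in zip(p, q):
        # Probability that both sides take the same value.
        match = sum(prob * dist_q.get(value, 0.0) for value, prob in dist_p.items())
        total += 1.0 - match
    return total / len(p)


def heterogeneous_distance(x: UncertainPoint, y: UncertainPoint, ranges: List[float]) -> float:
    """Combine both parts, weighted by attribute counts (an assumed weighting)."""
    n_num, n_cat = len(x.numeric), len(x.categorical)
    if n_num + n_cat == 0:
        return 0.0
    d_num = numeric_distance(x.numeric, y.numeric, ranges)
    d_cat = categorical_distance(x.categorical, y.categorical)
    return (n_num * d_num + n_cat * d_cat) / (n_num + n_cat)


# One numeric attribute with range 10 and one uncertain categorical attribute.
x = UncertainPoint([0.0], [{"tcp": 1.0}])
y = UncertainPoint([5.0], [{"tcp": 0.5, "udp": 0.5}])
d = heterogeneous_distance(x, y, [10.0])
```

Under these assumptions the categorical term is an expected mismatch, so an uncertain record compared with itself does not necessarily score zero. A real design would need to decide whether that behavior is acceptable.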
Keywords
Uncertain data streams; Heterogeneous data; Clustering; Evolution-based clustering