Exclusive and Complete Clustering of Streams

  • Vasudha Bhatnagar
  • Sharanjit Kaur
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4653)

Abstract

Clustering an evolving data stream requires an algorithm that can adapt the discovered clustering model to changes in the data characteristics.

In this paper we propose an algorithm for exclusive and complete clustering of data streams. We explain the concept of completeness of a stream clustering algorithm and show that the proposed algorithm guarantees detection of a cluster if one exists. The algorithm has an on-line component with constant-order time complexity and hence delivers predictable performance for stream processing. It is capable of detecting outliers and changes in the data distribution. Clustering is done by growing dense regions in the data space while honouring a recency constraint. The algorithm delivers a complete description of the clusters, facilitating semantic interpretation.
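The abstract does not give the algorithm's details, so the following is only a minimal Python sketch of the general idea of growing dense regions under a recency constraint. It is not the authors' method. The grid discretisation, the exponential decay, and every name in it (`GridStreamSketch`, `cell_width`, `decay`, `dense_threshold`) are assumptions made for illustration. The on-line step touches a single grid cell per point, so its cost does not depend on the stream length. The off-line step joins adjacent dense cells, so each dense cell ends up in exactly one cluster.

```python
from collections import deque


class GridStreamSketch:
    """Hypothetical grid-based stream summary; not the paper's algorithm."""

    def __init__(self, cell_width, decay, dense_threshold):
        self.cell_width = cell_width            # side length of a grid cell (assumed)
        self.decay = decay                      # per-step weight factor in (0, 1] (assumed)
        self.dense_threshold = dense_threshold  # minimum decayed weight of a dense cell
        self.cells = {}                         # cell key -> (weight, last update time)
        self.now = 0                            # logical clock: one tick per point

    def _cell(self, point):
        # Map a point to the integer coordinates of its grid cell.
        return tuple(int(x // self.cell_width) for x in point)

    def insert(self, point):
        # On-line step: one dictionary lookup and update per arriving point.
        self.now += 1
        key = self._cell(point)
        weight, last = self.cells.get(key, (0.0, self.now))
        # Age the stored weight, then count the new point with weight 1.
        weight = weight * self.decay ** (self.now - last) + 1.0
        self.cells[key] = (weight, self.now)

    def _current_weight(self, key):
        weight, last = self.cells[key]
        return weight * self.decay ** (self.now - last)

    def clusters(self):
        # Off-line step: connected components of face-adjacent dense cells.
        dense = {k for k in self.cells
                 if self._current_weight(k) >= self.dense_threshold}
        seen, result = set(), []
        for start in sorted(dense):
            if start in seen:
                continue
            seen.add(start)
            queue, component = deque([start]), set()
            while queue:
                cell = queue.popleft()
                component.add(cell)
                for axis in range(len(cell)):
                    for step in (-1, 1):
                        nb = cell[:axis] + (cell[axis] + step,) + cell[axis + 1:]
                        if nb in dense and nb not in seen:
                            seen.add(nb)
                            queue.append(nb)
            result.append(component)
        return result


model = GridStreamSketch(cell_width=1.0, decay=1.0, dense_threshold=2)
for p in [(0.1, 0.1), (0.2, 0.3), (1.1, 0.2), (1.5, 0.5), (5.5, 5.5)]:
    model.insert(p)
found = model.clusters()
```

In this usage example the two neighbouring cells that each receive two points form a single cluster, while the isolated point at (5.5, 5.5) falls in a sparse cell and could be treated as an outlier candidate. With `decay` below 1, cells that stop receiving points lose weight over time and eventually drop out of the dense set, which is one simple way to model recency.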



Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Vasudha Bhatnagar (1)
  • Sharanjit Kaur (1)
  1. Department of Computer Science, University of Delhi, Delhi, India
