An EM-Based Algorithm for Clustering Data Streams in Sliding Windows

  • Xuan Hong Dang
  • Vincent Lee
  • Wee Keong Ng
  • Arridhana Ciptadi
  • Kok Leong Ong
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5463)

Abstract

Cluster analysis has played a key role in data understanding. Extending this important data mining task to data streams makes it more challenging, because the data arrive at the mining system and must be processed in a single pass. The problem is harder still under a sliding window model, which requires that outdated data be eliminated properly. We propose the SWEM algorithm, which exploits the Expectation Maximization technique to address these challenges. SWEM not only processes the stream incrementally but also adapts to changes in the underlying stream distribution.
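The abstract does not give the details of SWEM, so the following is only a minimal sketch of the general idea it describes, not the authors' algorithm. It assumes a one-dimensional Gaussian mixture, a window made of a fixed number of time slots, and per-slot sufficient statistics that are frozen once a slot has been processed. The class name `SlidingWindowEM` and all of its parameters are hypothetical.

```python
import math
from collections import deque


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


class SlidingWindowEM:
    """Hypothetical sketch: a 1-D Gaussian mixture over the last `max_slots` time slots."""

    def __init__(self, init_means, max_slots, init_var=1.0, min_var=1e-3):
        k = len(init_means)
        self.weights = [1.0 / k] * k
        self.means = list(init_means)
        self.variances = [init_var] * k
        self.min_var = min_var
        # Each entry holds one slot's statistics; a full deque drops its oldest entry.
        self.slots = deque(maxlen=max_slots)

    def _e_step(self, points):
        # Per component: [sum of responsibilities, sum of r*x, sum of r*x^2].
        k = len(self.means)
        stats = [[0.0, 0.0, 0.0] for _ in range(k)]
        for x in points:
            dens = [w * gaussian_pdf(x, m, v)
                    for w, m, v in zip(self.weights, self.means, self.variances)]
            total = sum(dens)
            if total > 0.0:
                resp = [d / total for d in dens]
            else:
                # All densities underflowed: assign the point to the nearest mean.
                nearest = min(range(k), key=lambda j: abs(x - self.means[j]))
                resp = [1.0 if j == nearest else 0.0 for j in range(k)]
            for j, r in enumerate(resp):
                stats[j][0] += r
                stats[j][1] += r * x
                stats[j][2] += r * x * x
        return stats

    def _m_step(self):
        # Re-estimate the mixture from the statistics of every slot still in the window.
        k = len(self.means)
        totals = [[sum(slot[j][i] for slot in self.slots) for i in range(3)]
                  for j in range(k)]
        n_all = sum(t[0] for t in totals)
        if n_all <= 0.0:
            return  # nothing in the window yet
        for j, (n, sx, sxx) in enumerate(totals):
            self.weights[j] = n / n_all
            if n > 1e-9:  # keep the old mean and variance of an empty component
                self.means[j] = sx / n
                self.variances[j] = max(sxx / n - self.means[j] ** 2, self.min_var)

    def add_slot(self, points, iterations=10):
        """Process one new time slot; the raw points are not kept afterwards."""
        self.slots.append(None)  # placeholder; evicts the oldest slot if the window is full
        for _ in range(iterations):
            self.slots[-1] = self._e_step(points)  # E-step on the new slot only
            self._m_step()                         # M-step over the whole window
```

Calling `add_slot` once per time slot keeps the model incremental: older slots contribute only their stored statistics, and an expired slot stops influencing the estimates as soon as it leaves the deque. Freezing the statistics of older slots is a simplification chosen for this sketch, so the result only approximates what a full EM run over the raw window contents would give.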

Keywords

  • Time Slot
  • Data Stream
  • Sliding Window
  • Merging Oper
  • Cluster Data Stream
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Xuan Hong Dang (1)
  • Vincent Lee (1)
  • Wee Keong Ng (2)
  • Arridhana Ciptadi (2)
  • Kok Leong Ong (3)

  1. Monash University, Australia
  2. Nanyang Technological University, Singapore
  3. Deakin University, Australia