Connectivity Based Stream Clustering Using Localised Density Exemplars

  • Sebastian Lühr
  • Mihai Lazarescu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5012)

Abstract

Advances in data acquisition have made it possible to collect millions of time-varying records in the form of data streams. The challenge is to process the stream data effectively with limited resources while retaining enough historical information to characterise changes and patterns over time. This paper describes an evidence-based approach that uses representative points to process stream data incrementally, using a graph-based method to cluster points according to connectivity and density. Critical cluster features are archived in repositories, which allows the algorithm to cope with recurrent information and provides a rich history of relevant cluster changes if analysis of past data is required. We demonstrate our work on both synthetic and real-world data sets.
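The idea of clustering representative points by connectivity can be illustrated with a minimal sketch. This is not the algorithm from the paper: it assumes a fixed linking radius, a brute-force pairwise comparison, and a union-find structure, none of which are stated in the abstract. The function name `connected_clusters` and the parameter `radius` are hypothetical and chosen only for illustration.

```python
from math import dist


def connected_clusters(points, radius):
    """Group points into clusters by connectivity.

    Assumption (not from the paper): two points are linked when they lie
    within `radius` of each other, and a cluster is a connected component
    of the resulting graph.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the component root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Brute-force edge construction: link every pair closer than `radius`.
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if dist(points[i], points[j]) <= radius:
                parent[find(i)] = find(j)

    # Collect point indices per component root.
    clusters = {}
    for i in range(len(points)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Hypothetical exemplar points: a chain of three and a separate pair.
exemplars = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (5.0, 5.0), (5.4, 5.0)]
clusters = connected_clusters(exemplars, radius=0.6)
```

In this sketch the first and third points end up in the same cluster even though they are farther apart than `radius`, because the second point connects them. That chaining is what distinguishes connectivity-based grouping from assigning points to the nearest centre. A streaming setting would additionally need incremental updates and a density criterion, which are left out here.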

Keywords

Execution Time · Data Stream · Very Large Scale Integration · Sparse Graph · Binary Search Tree
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Sebastian Lühr (1)
  • Mihai Lazarescu (1)
  1. Department of Computing, Curtin University of Technology, Bentley