Clustering Over Data Streams Based on Growing Neural Gas

  • Mohammed Ghesmoune
  • Mustapha Lebbah
  • Hanene Azzag
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9078)

Abstract

Clustering data streams requires a process capable of partitioning observations continuously under memory and time constraints. In this paper we present a new algorithm, called G-Stream, for clustering data streams in a single pass over the data. G-Stream is based on growing neural gas (GNG), which allows us to discover clusters of arbitrary shape without any assumption about the number of clusters. Using a reservoir and applying a fading function improves the quality of the clustering. The performance of the proposed algorithm is evaluated on public data sets.
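
To make the ingredients named in the abstract more concrete, the following is a minimal sketch of a one-pass prototype update that combines an exponential fading function with a reservoir for distant points. It is not the authors' G-Stream implementation: the class `StreamSketch`, the function `fade`, the parameters `threshold`, `lr`, `lam` and `min_weight`, and the specific fading form 2^(-lam * elapsed) are all assumptions chosen for illustration, and the periodic node insertion of growing neural gas is omitted.

```python
import math


def fade(weight, elapsed, lam=0.01):
    """Assumed exponential fading: the weight halves every 1/lam time steps."""
    return weight * 2.0 ** (-lam * elapsed)


def dist(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class StreamSketch:
    """Hypothetical one-pass clusterer: prototypes, edges, and a reservoir."""

    def __init__(self, threshold, lr=0.1, lam=0.01, min_weight=0.1):
        self.threshold = threshold    # max distance at which a node absorbs a point
        self.lr = lr                  # step size for moving the winning node
        self.lam = lam                # fading rate
        self.min_weight = min_weight  # nodes below this weight are dropped
        self.nodes = {}               # node id -> {"pos": [...], "weight": float}
        self.edges = set()            # frozenset({id_a, id_b})
        self.reservoir = []           # points too far from every node
        self._next_id = 0

    def _add_node(self, x):
        self.nodes[self._next_id] = {"pos": list(x), "weight": 1.0}
        self._next_id += 1

    def process(self, x):
        """Consume one observation; each point is seen exactly once."""
        # Fade every node by one time step, then prune stale nodes and their edges.
        for node in self.nodes.values():
            node["weight"] = fade(node["weight"], 1, self.lam)
        stale = {i for i, n in self.nodes.items() if n["weight"] < self.min_weight}
        for i in stale:
            del self.nodes[i]
        self.edges = {e for e in self.edges if not (e & stale)}

        # Bootstrap: the first points become nodes directly.
        if len(self.nodes) < 2:
            self._add_node(x)
            return

        # Find the two nodes nearest to x.
        ranked = sorted(self.nodes, key=lambda i: dist(self.nodes[i]["pos"], x))
        first, second = ranked[0], ranked[1]

        # A point far from every node is set aside instead of distorting a node.
        if dist(self.nodes[first]["pos"], x) > self.threshold:
            self.reservoir.append(tuple(x))
            return

        # Move the winner toward x and reinforce its weight.
        winner = self.nodes[first]
        winner["pos"] = [p + self.lr * (xi - p) for p, xi in zip(winner["pos"], x)]
        winner["weight"] += 1.0

        # Simplification: link winner and runner-up only if both are close to x.
        if dist(self.nodes[second]["pos"], x) <= self.threshold:
            self.edges.add(frozenset((first, second)))
```

In this sketch, clusters would correspond to the connected components of the graph formed by `nodes` and `edges`, which is how a topology-learning method can follow clusters of arbitrary shape without fixing their number in advance. How the reservoir is later re-processed is left open here.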

Keywords

Data stream clustering, Topological structure, GNG

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Mohammed Ghesmoune (1)
  • Mustapha Lebbah (1)
  • Hanene Azzag (1)

  1. University of Paris 13, Sorbonne Paris City, LIPN-UMR 7030 - CNRS, Villetaneuse, France
