On Fuzzy Clustering of Data Streams with Concept Drift

  • Maciej Jaworski
  • Piotr Duda
  • Lena Pietruczuk
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7268)

Abstract

This paper considers clustering algorithms based on fuzzy set theory. Modifications of the Fuzzy C-Means and Possibilistic C-Means algorithms are presented that adapt them to data streams. Since a data stream is of infinite size, it has to be partitioned into chunks. Simulations show that this partitioning procedure does not significantly affect the quality of the clustering results. Moreover, properly chosen weights can be assigned to each data element. This modification allows the presented algorithms to handle concept drift in the simulations.
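The chunk-and-weight idea can be illustrated with a minimal sketch. The abstract does not specify the weighting scheme, so everything below is an assumption rather than the authors' method: each chunk is clustered by a weighted Fuzzy C-Means, the previous cluster centers are carried into the next chunk as weighted summary points, and a hypothetical `decay` factor shrinks their weights so that older data gradually loses influence. The names `memberships`, `weighted_fcm`, `cluster_stream` and `decay` are illustrative only.

```python
from typing import List, Sequence

Point = Sequence[float]


def memberships(x: Point, centers: List[List[float]], m: float) -> List[float]:
    """Standard FCM memberships of point x in each cluster (they sum to 1)."""
    # Squared distances, floored to avoid division by zero at a center.
    d2 = [max(sum((a - b) ** 2 for a, b in zip(x, v)), 1e-12) for v in centers]
    inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
    total = sum(inv)
    return [value / total for value in inv]


def weighted_fcm(points, weights, centers, m=2.0, iters=50):
    """Fuzzy C-Means in which point j contributes with weight w_j."""
    dim = len(points[0])
    centers = [list(v) for v in centers]
    for _ in range(iters):
        num = [[0.0] * dim for _ in centers]
        den = [0.0] * len(centers)
        for x, w in zip(points, weights):
            for i, u in enumerate(memberships(x, centers, m)):
                coef = w * u ** m  # weighted fuzzy contribution of x to cluster i
                den[i] += coef
                for k in range(dim):
                    num[i][k] += coef * x[k]
        centers = [[num[i][k] / den[i] for k in range(dim)]
                   for i in range(len(centers))]
    return centers


def cluster_stream(chunks, init_centers, decay=0.5, m=2.0):
    """Process a stream chunk by chunk (assumed scheme, not the paper's).

    Earlier chunks are summarised by the previous centers; their summary
    weights are multiplied by `decay` before each new chunk is clustered.
    """
    centers = [list(v) for v in init_centers]
    summary_w = [0.0] * len(centers)  # no history before the first chunk
    for chunk in chunks:
        points = [list(x) for x in chunk] + centers
        weights = [1.0] * len(chunk) + [decay * w for w in summary_w]
        centers = weighted_fcm(points, weights, centers, m=m)
        # New summary weight of a cluster: weighted sum of its memberships.
        summary_w = [0.0] * len(centers)
        for x, w in zip(points, weights):
            for i, u in enumerate(memberships(x, centers, m)):
                summary_w[i] += w * u
    return centers


if __name__ == "__main__":
    before = [(0.0,), (0.2,), (-0.2,), (10.0,), (10.2,), (9.8,)]
    after = [(2.0,), (2.2,), (1.8,), (12.0,), (12.2,), (11.8,)]  # drifted clusters
    print(cluster_stream([before, after], [(1.0,), (9.0,)], decay=0.5))
```

In this sketch, `decay=0` forgets all history after each chunk, while `decay=1` keeps every past chunk at full weight, so the final centers lag behind a drifting distribution; intermediate values trade stability against adaptation speed.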

Keywords

Data Stream, Cluster Center, Fuzzy Cluster, Concept Drift, Data Chunk
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Maciej Jaworski (1)
  • Piotr Duda (1)
  • Lena Pietruczuk (1)
  1. Department of Computer Engineering, Czestochowa University of Technology, Czestochowa, Poland
