On Resources Optimization in Fuzzy Clustering of Data Streams

  • Maciej Jaworski
  • Lena Pietruczuk
  • Piotr Duda
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7268)


This paper studies the resource consumption of fuzzy clustering algorithms for data streams, using the wFCM and wPCM algorithms as examples. It is shown that partitioning a data stream into chunks significantly reduces the processing time of the considered algorithms. The partitioning reduces the accuracy of the results, but the change is acceptable. The problems that arise with high-speed data streams are presented as well. The uncontrolled growth of subsequent data chunk sizes, which leads to overflow of the available memory, is demonstrated for both the wFCM and wPCM algorithms. A maximum chunk size limit is introduced as a solution to this problem. This modification ensures that the available memory is never exceeded, as shown in the simulations, and it decreases the quality of the clustering results only slightly.
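The chunk-wise processing and the chunk size cap described in the abstract can be illustrated with a minimal sketch. The abstract does not define wFCM, so everything here is an assumption rather than the authors' method: the code uses a generic weighted fuzzy c-means in a single-pass style (each finished chunk is summarized by its weighted centers, which are then carried into the next chunk), it works on 1-D points for brevity, and it enforces the cap by uniform random subsampling. The names `memberships`, `weighted_fcm`, `cluster_stream` and `max_chunk_size` are hypothetical.

```python
import random


def memberships(x, centers, m, eps=1e-9):
    """Fuzzy memberships of a 1-D point x in every cluster (they sum to 1)."""
    # u_i is proportional to d_i^(-2/(m-1)); eps avoids division by zero.
    inv = [((x - c) ** 2 + eps) ** (-1.0 / (m - 1.0)) for c in centers]
    total = sum(inv)
    return [v / total for v in inv]


def weighted_fcm(points, weights, centers, m=2.0, iters=50):
    """Generic weighted fuzzy c-means on one in-memory chunk (assumed variant)."""
    centers = list(centers)
    for _ in range(iters):
        num = [0.0] * len(centers)
        den = [0.0] * len(centers)
        for x, w in zip(points, weights):
            for i, u in enumerate(memberships(x, centers, m)):
                num[i] += w * u ** m * x
                den[i] += w * u ** m
        centers = [n / d for n, d in zip(num, den)]
    # Summarize the chunk: a center's weight is the weighted membership it holds.
    mass = [0.0] * len(centers)
    for x, w in zip(points, weights):
        for i, u in enumerate(memberships(x, centers, m)):
            mass[i] += w * u
    return centers, mass


def cluster_stream(chunks, init_centers, max_chunk_size=None, m=2.0):
    """Cluster a stream chunk by chunk, optionally capping each chunk's size."""
    centers, mass = list(init_centers), None
    for chunk in chunks:
        # Assumed cap policy: keep a uniform random subsample of the chunk,
        # so memory use per step is bounded by max_chunk_size.
        if max_chunk_size is not None and len(chunk) > max_chunk_size:
            chunk = random.sample(chunk, max_chunk_size)
        points, weights = list(chunk), [1.0] * len(chunk)
        if mass is not None:
            # Carry the previous chunks forward as a few weighted points.
            points += centers
            weights += mass
        centers, mass = weighted_fcm(points, weights, centers, m)
    return centers
```

In this sketch the per-step memory is bounded by `max_chunk_size` plus the number of clusters, at the cost of discarding some stream elements, which is one plausible way the slight loss of clustering quality mentioned above could arise.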


Data Stream, Fuzzy Cluster, Resource Optimization, Chunk Size, Data Chunk
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Maciej Jaworski
    • 1
  • Lena Pietruczuk
    • 1
  • Piotr Duda
    • 1
  1. Department of Computer Engineering, Czestochowa University of Technology, Czestochowa, Poland
