Abstract
This paper studies the resource consumption of fuzzy clustering algorithms for data streams, using the wFCM and wPCM algorithms as examples. It is shown that partitioning a data stream into chunks significantly reduces the processing time of the considered algorithms. Partitioning also reduces the accuracy of the results, but the loss is acceptable. Problems that arise with high-speed data streams are presented as well. For both the wFCM and wPCM algorithms, the uncontrolled growth of subsequent data chunk sizes, which leads to an overflow of the available memory, is demonstrated. As a solution to this problem, a modification that limits the maximum chunk size is introduced. Simulations show that this modification ensures the available memory is never exceeded. The modification decreases the quality of the clustering results only slightly.
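The chunk-wise processing and the maximum chunk size limit described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a generic weighted fuzzy c-means in which every point carries a weight and the centroids of earlier chunks are carried forward as weighted points. The function names (`memberships`, `weighted_fcm`, `cluster_stream`), the fuzzifier `m`, the iteration count, the initialization from the first `c` points, and the policy of simply dropping points beyond the limit are all assumptions made for illustration.

```python
def memberships(points, centers, m):
    """Fuzzy membership u[i][k] of every point i in every cluster k."""
    u = []
    for p in points:
        # Squared Euclidean distances, floored to avoid division by zero.
        d2 = [max(sum((a - b) ** 2 for a, b in zip(p, v)), 1e-12)
              for v in centers]
        inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
        total = sum(inv)
        u.append([x / total for x in inv])
    return u


def weighted_fcm(points, weights, centers, m=2.0, iters=30):
    """Weighted fuzzy c-means started from the given initial centers."""
    dim = len(points[0])
    for _ in range(iters):
        u = memberships(points, centers, m)
        new_centers = []
        for k in range(len(centers)):
            coef = [w * row[k] ** m for w, row in zip(weights, u)]
            norm = sum(coef)
            new_centers.append(tuple(
                sum(cf * p[j] for cf, p in zip(coef, points)) / norm
                for j in range(dim)))
        centers = new_centers
    u = memberships(points, centers, m)
    # Weight of each centroid: weighted sum of the memberships it received.
    cweights = [sum(w * row[k] for w, row in zip(weights, u))
                for k in range(len(centers))]
    return centers, cweights


def cluster_stream(chunks, c, max_chunk_size=None):
    """Cluster a stream chunk by chunk, carrying weighted centroids forward."""
    centers, cweights, sizes = [], [], []
    for chunk in chunks:
        if max_chunk_size is not None:
            # ASSUMPTION: points beyond the limit are discarded.
            chunk = chunk[:max_chunk_size]
        sizes.append(len(chunk))
        # ASSUMPTION: the first c points of the first chunk seed the centers.
        init = centers if centers else list(chunk[:c])
        points = list(centers) + list(chunk)
        weights = list(cweights) + [1.0] * len(chunk)
        centers, cweights = weighted_fcm(points, weights, init)
    return centers, cweights, sizes
```

Here `chunks` is a list of chunks, each a list of equal-length tuples. With `max_chunk_size` set, no chunk handed to the clustering step exceeds the limit, so the memory needed per step stays bounded in this sketch, at the cost of ignoring some data.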
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jaworski, M., Pietruczuk, L., Duda, P. (2012). On Resources Optimization in Fuzzy Clustering of Data Streams. In: Rutkowski, L., Korytkowski, M., Scherer, R., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds) Artificial Intelligence and Soft Computing. ICAISC 2012. Lecture Notes in Computer Science, vol 7268. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29350-4_11
DOI: https://doi.org/10.1007/978-3-642-29350-4_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29349-8
Online ISBN: 978-3-642-29350-4
eBook Packages: Computer Science (R0)