Comparing Fuzzy-C Means and K-Means Clustering Techniques: A Comprehensive Study
Clustering techniques are unsupervised learning methods that group similar data together and separate them from dissimilar data. They are therefore popular for data mining and pattern recognition. However, their performance is data dependent, so choosing the right clustering technique for a given dataset is a research challenge. In this paper, we test the performance of a soft clustering technique, Fuzzy C-means (FCM), and a hard clustering technique, K-means (KM), on the Iris (150 x 4), Wine (178 x 13) and Lens (24 x 4) datasets. The distance measure, which computes the similarity between any two data points, is at the heart of any clustering algorithm. Two distance measures, Manhattan (MH) and Euclidean (ED), are used to observe how they influence overall clustering performance. Performance is compared on seven parameters: (i) sensitivity, (ii) specificity, (iii) precision, (iv) accuracy, (v) run time, (vi) average intra-cluster distance (i.e., compactness of the clusters) and (vii) inter-cluster distance (i.e., distinctiveness of the clusters). Based on the experimental results, the paper concludes that both KM and FCM perform well. However, KM outperforms FCM in terms of speed. The FCM-MH combination produces the most compact clusters, while KM-ED yields the most distinct clusters.
Keywords: Clustering, FCM, KM, Distance measures, Performance test
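As an illustration of the quantities named in the abstract, the following minimal Python sketch computes the Manhattan and Euclidean distances, a centroid-based average intra-cluster distance and inter-cluster distance, and the four classification measures. The abstract does not give the exact formulas, so the centroid-based definitions here are assumptions. All function names and the toy data are hypothetical and are not taken from the experiments.

```python
import math

def manhattan(a, b):
    """Manhattan (MH) distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def euclidean(a, b):
    """Euclidean (ED) distance: square root of summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def avg_intra_cluster_distance(clusters, dist):
    """Assumed definition: mean distance from each point to its own centroid.
    Smaller values indicate more compact clusters."""
    total, count = 0.0, 0
    for points in clusters:
        c = centroid(points)
        total += sum(dist(p, c) for p in points)
        count += len(points)
    return total / count

def avg_inter_cluster_distance(clusters, dist):
    """Assumed definition: mean pairwise distance between cluster centroids.
    Larger values indicate more distinct clusters."""
    cents = [centroid(points) for points in clusters]
    pairs = [(i, j) for i in range(len(cents)) for j in range(i + 1, len(cents))]
    return sum(dist(cents[i], cents[j]) for i, j in pairs) / len(pairs)

def binary_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, precision and accuracy from confusion counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

# Toy, made-up data: two clusters of two 2-D points each.
toy_clusters = [[(0, 0), (2, 2)], [(6, 0), (8, 2)]]

for name, dist in (("MH", manhattan), ("ED", euclidean)):
    print(name,
          avg_intra_cluster_distance(toy_clusters, dist),
          avg_inter_cluster_distance(toy_clusters, dist))
```

Passing the distance function as an argument makes it possible to evaluate the same cluster assignment under both measures, which mirrors the MH versus ED comparison described in the abstract.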