Abstract
The fuzzy c-means (FCM) algorithm is an important clustering method in pattern recognition, and its fuzziness parameter, m, is a key parameter that can significantly affect the clustering result. A cluster validity index (CVI) is a criterion function used to validate clustering results and thereby determine the optimal number of clusters in a data set. From the perspective of cluster validation, we propose a novel method for selecting the optimal value of m in FCM, using four well-known CVIs for fuzzy clustering: XB, VK, VT, and SC. In this method, the optimal value of m is the one at which the CVIs reach their minimum values. Experimental results on four synthetic data sets and four real data sets demonstrate that the range of m is [2, 3.5] and the optimal interval is [2.5, 3].
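The selection procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `fcm`, `xie_beni`, and `select_m` are hypothetical, only the XB index is shown (VK, VT, and SC are omitted), the XB index is assumed to take its commonly used generalized form with membership exponent m, and the toy data and the grid of m values are made up for the example.

```python
import numpy as np

def fcm(X, c, m, n_iter=100, tol=1e-6, seed=0):
    """Plain FCM alternating updates; returns (centers V, memberships U)."""
    rng = np.random.default_rng(seed)
    U = rng.random((c, X.shape[0]))
    U /= U.sum(axis=0)                       # each point's memberships sum to 1
    for _ in range(n_iter):
        W = U ** m
        V = (W @ X) / W.sum(axis=1, keepdims=True)              # cluster centers
        d2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2)  # squared distances, shape (c, n)
        d2 = np.maximum(d2, 1e-12)           # avoid division by zero
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0)        # membership update
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return V, U

def xie_beni(X, V, U, m):
    """XB index (assumed generalized form with exponent m); lower is better."""
    d2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2)
    compactness = ((U ** m) * d2).sum()
    sep = ((V[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(sep, np.inf)            # ignore distance of a center to itself
    return compactness / (X.shape[0] * sep.min())

def select_m(X, c, m_grid):
    """Run FCM for each candidate m and keep the m with the smallest XB value."""
    scores = {float(m): xie_beni(X, *fcm(X, c, m), m) for m in m_grid}
    best = min(scores, key=scores.get)
    return best, scores

# Toy data (assumption): three Gaussian blobs in 2-D.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(loc, 0.5, size=(50, 2))
               for loc in ([0, 0], [4, 0], [2, 4])])
m_grid = np.arange(1.5, 3.51, 0.25)          # illustrative grid of candidate m values
best_m, scores = select_m(X, c=3, m_grid=m_grid)
print("m with minimal XB on this toy set:", best_m)
```

The m reported for this toy set says nothing about the intervals given in the abstract; those come from the paper's own experiments with all four indices.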
Cite this article
Zhou, K., Fu, C. & Yang, S. Fuzziness parameter selection in fuzzy c-means: The perspective of cluster validation. Sci. China Inf. Sci. 57, 1–8 (2014). https://doi.org/10.1007/s11432-014-5146-0
DOI: https://doi.org/10.1007/s11432-014-5146-0