Abstract
Interval-valued type-2 fuzziness can be represented by membership functions obtained with the upper and lower values of the level of fuzziness. These upper and lower values for the level of fuzziness in the FCM algorithm were obtained in our previous studies. In this chapter, a particular application of interval-valued type-2 fuzziness to cluster validity analysis is shown. For this purpose, we introduce a brief taxonomy of cluster validity indices to clarify the contribution of our novel approach. To make our technique reproducible, the source code is written in the freely available language ‘R’ and can be found on our web site.
Notes
- 1.
The ‘R’ software can be downloaded from the http://cran.r-project.org/ web site.
- 2.
- 3.
- 4.
The Euclidean distance is defined as the square root of \( \sum\limits_{i} {(x_{i} - y_{i} )^{2} } \), and the Manhattan distance is \( \sum\limits_{i} {\left| {x_{i} - y_{i} } \right|} \).
- 5.
Kim et al. [27] used a similar intuition and suggested that the optimal number of clusters can be found by minimizing the change in the cluster centers with respect to the number of clusters.
- 6.
The \( l_{\infty } \) norm is defined as \( \lim_{p \to \infty } l_{p} \), where \( l_{p} \) is the p-norm. Since the p-norm is given as \( \left\| {v_{c}^{{m_{u} }} - v_{c}^{{m_{l} }} } \right\|_{p} = \left( {\sum\limits_{i = 1}^{nv} {\left| {v_{c,i}^{{m_{u} }} - v_{c,i}^{{m_{l} }} } \right|^{p} } } \right)^{\frac{1}{p}} \), it follows that \( \left\| {v_{c}^{{m_{u} }} - v_{c}^{{m_{l} }} } \right\|_{\infty } = \lim_{p \to \infty } \left\| {v_{c}^{{m_{u} }} - v_{c}^{{m_{l} }} } \right\|_{p} = \max_{i = 1}^{nv} \left| {v_{c,i}^{{m_{u} }} - v_{c,i}^{{m_{l} }} } \right| \).
- 7.
The Iris data set is available in R. The Wine data set can be downloaded from http://archive.ics.uci.edu/ml/machine-learning-databases/wine/ manually or from within R, as shown in the “iris and wine data ex.r” script file.
- 8.
- 9.
This software can be downloaded from http://cran.r-project.org/. Several good documents on this statistical computing environment are also available, and more than 3500 packages have already been prepared for it.
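The distances in note 4 and the norm limit in note 6 can be checked numerically. The following is a minimal sketch in Python, not the chapter's R code; the function names and the sample vectors `v_upper` and `v_lower` are hypothetical and chosen only for illustration. It computes the Euclidean and Manhattan distances and shows the p-norm of a difference vector approaching its largest absolute component as p grows.

```python
import math


def euclidean(x, y):
    """Square root of the sum of squared component differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def manhattan(x, y):
    """Sum of absolute component differences."""
    return sum(abs(a - b) for a, b in zip(x, y))


def max_norm(v):
    """l-infinity norm: the largest absolute component."""
    return max(abs(a) for a in v)


def p_norm(v, p):
    """p-norm of v, scaled by the largest component to avoid overflow."""
    m = max_norm(v)
    if m == 0:
        return 0.0
    return m * sum((abs(a) / m) ** p for a in v) ** (1.0 / p)


# Hypothetical cluster centers at the upper and lower levels of fuzziness.
v_upper = [1.0, 4.0, 2.5]
v_lower = [0.5, 1.0, 3.0]
diff = [u - l for u, l in zip(v_upper, v_lower)]

for p in (1, 2, 8, 64):
    print(p, p_norm(diff, p))
print("max", max_norm(diff))
```

The scaling inside `p_norm` is a numerical convenience only; it leaves the value of the norm unchanged.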
References
Ben-Hur, A., Elisseeff, A., Guyon, I.: A stability based method for discovering structure in clustered data. In: Pacific Symposium on Biocomputing, vol. 7, pp. 6–17. World Scientific Publishing Co., New Jersey (2002)
Ben-Hur, A., Guyon, I.: Detecting stable clusters using principal component analysis. In: Brownstein, M.J., Khodursky, A. (eds.) Methods in Molecular Biology, pp. 159–182. Humana Press, Clifton (2003)
Bezdek, J.C.: Fuzzy mathematics in pattern classification. Dissertation, Applied Mathematics Center, Cornell University, Ithaca (1973)
Bezdek, J.C.: Cluster validity with fuzzy sets. J. Cybern. 3, 58–72 (1974)
Bezdek, J.C.: Mathematical models for systematics and taxonomy. In: Estabrook, G. (ed.) Proceedings of the 8th International Conference on Numerical Taxonomy, pp. 143–166. Freeman, San Francisco (1975)
Bezdek, J.C.: Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum Press, New York (1981)
Bolshakova, N., Azuaje, F.: Cluster validation techniques for genome expression data. Sig. Process. 83, 825–833 (2003)
Bolshakova, N., Azuaje, F., Cunningham, P.: A knowledge-driven approach to cluster validity assessment. Bioinformatics 21, 2546–2547 (2005)
Boudraa, A.O.: Dynamic estimation of number of clusters in data set. Electron. Lett. 35, 1606–1608 (1999)
Bouguessa, M., Wang, S., Sun, H.: An objective approach to cluster validation. Pattern Recogn. Lett. 27, 1419–1430 (2006)
Breckenridge, J.: Replicating cluster analysis: method, consistency and validity. Multivar. Behav. Res. 24, 147–161 (1989)
Brock, G.N., Pihur, V., Datta, S., Datta, S.: clValid: an R package for cluster validation. J. Stat. Softw. 25(4), 1–22 (2008). http://www.jstatsoft.org/v25/i04
Celikyilmaz, A., Türksen, I.B.: Enhanced fuzzy system models with improved fuzzy clustering algorithm. IEEE Trans. Fuzzy Syst. 16, 779–794 (2008)
Celikyilmaz, A., Türksen, I.B.: Validation criteria for enhanced fuzzy clustering. Pattern Recogn. Lett. 29, 97–108 (2008)
Chen, M.Y., Linkens, A.: Rule-base self-generation and simplification for data-driven fuzzy models. J. Fuzzy Sets Syst. 142, 243–265 (2004)
Davies, D.L., Bouldin, D.W.: A cluster separation measure. IEEE Trans. Pattern Anal. Mach. Intell. 1, 224–227 (1979)
Dudoit, S., Fridlyand, J.: A prediction based re-sampling method for estimating the number of clusters in a data set. Genome Biol. 3, 1–21 (2002)
Dunn, J.C.: Well separated clusters and fuzzy partitions. J. Cybern. 4, 95–104 (1974)
Falasconi, M., Gutierrez, A., Pardo, M., Sberveglieri, G., Marco, S.: A stability based validity method for fuzzy clustering. Pattern Recogn. 43, 1292–1305 (2010)
Fukuyama, Y., Sugeno, M.: A new method of choosing the number of clusters for the fuzzy c-means method. In: Proceedings of Fifth Fuzzy Systems Symposium, pp. 247–250 (in Japanese) (1989)
Gan, G., Chaoqun, M., Jianhong, W.: Data Clustering: Theory, Algorithms, and Applications. ASA-SIAM Series on Statistics and Applied Probability. SIAM, Philadelphia, ASA, Alexandria (2007)
Halkidi, M., Batistakis, Y., Vazirgiannis, M.: Clustering validity methods part I. ACM SIGMOD Rec. 31, 40–45 (2002)
Halkidi, M., Batistakis, Y., Vazirgiannis, M.: Clustering validity methods part II. ACM SIGMOD Rec. 31, 19–27 (2002)
Halkidi, M., Batistakis, Y., Vazirgiannis, M.: On clustering validation techniques. Intell. Inf. Syst. J. 17, 107–145 (2001). Kluwer Publishers
Handl, J., Knowles, J., Kell, D.B.: Computational cluster validation in post-genomic data analysis. Bioinformatics 21, 3201–3212 (2005)
Jain, A.K., Dubes, R.C.: Algorithms for Clustering Data. Prentice Hall, Englewood Cliffs (1988)
Kim, D.W., Lee, K.H., Lee, D.: On cluster validity index for estimation of the optimal number of fuzzy clusters. Pattern Recogn. 37, 2009–2025 (2004)
Kwon, S.K.: Cluster validity index for fuzzy clustering. Electron. Lett. 34, 2176–2177 (1998)
Lange, T., Roth, V., Braun, M.L., Buhmann, J.M.: Stability based validation of clustering solutions. Neural Comput. 16, 1299–1323 (2004)
Levine, E., Domany, E.: Resampling method for unsupervised estimation of cluster validity. Neural Comput. 13, 2573–2593 (2001)
Mufti, G.B., Bertrand, P., El Moubarki, L.: Determining the number of groups from measures of cluster validity. In: Proceedings of ASMDA2005, pp. 404–414 (2005)
Ozkan, I., Türksen, I.B.: Entropy assessment of type-2 fuzziness. IEEE Int. Conf. Fuzzy Syst. 2, 1111–1115 (2004)
Ozkan, I., Türksen, I.B.: Upper and lower values for the level of fuzziness in FCM. Inf. Sci. 177, 5143–5152 (2007)
Ozkan, I., Türksen, I.B.: MiniMax ɛ-stable cluster validity index for type-2 fuzziness. Inf. Sci. 184, 64–74 (2012)
Pal, N.R., Bezdek, J.C.: On cluster validity for the fuzzy c-means model. IEEE Trans. Fuzzy Syst. 3, 370–379 (1995)
Pascual, D., Pla, F., Sánchez, J.S.: Cluster validation using information stability measures. Pattern Recogn. Lett. 31, 454–461 (2010)
Rezaee, M.R., Lelieveldt, B.P.F., Reiber, J.H.C.: A new cluster validity index for the fuzzy c-means. Pattern Recogn. Lett. 19, 237–246 (1998)
Salem, S.A., Nandi, A.K.: Development of assessment criteria for clustering algorithms. Pattern Anal. Appl. 12, 79–98 (2009)
Shannon, C.E.: The mathematical theory of communication. Bell Syst. Tech. J. 27, 379–423 and 27, 623–656 (in two parts) (1948)
Sugar, C., James, G.: Finding the number of clusters in a data set: an information theoretic approach. J. Am. Stat. Assoc. 98, 750–763 (2003)
Sugar, C., Lenert, L., Olshen, R.: An application of cluster analysis to health services research: empirically defined health states for depression from the sf-12. Technical Report, Stanford University, Stanford (1999)
Tibshirani, R., Walther, G., Hastie, T.: Estimating the number of clusters in a data set via the gap statistic. J. R. Stat. Soc. B 63(2), 411–423 (2001)
Tibshirani, R., Walther, G.: Cluster validation by prediction strength. J. Comput. Graph. Stat. 14, 511–528 (2005)
Volkovich, Z., Barzily, Z., Morozensky, L.: A statistical model of cluster stability. Pattern Recogn. 41, 2174–2188 (2008)
Wang, W., Zhang, Y.: On fuzzy cluster validity indices. Fuzzy Sets Syst. 158, 2095–2117 (2007)
Wu, K.L., Yang, M.S., Hsieh, J.N.: Robust cluster validity indexes. Pattern Recogn. 42, 2541–2550 (2009)
Xie, X.L., Beni, G.: A validity measure for fuzzy clustering. IEEE Trans. Pattern Anal. Mach. Intell. 13, 841–847 (1991)
Yu, J., Cheng, Q., Huang, H.: Analysis of the weighting exponent in the FCM. IEEE Trans. Syst. Man Cybern. B 34, 634–639 (2004)
Zhang, Y., Wang, W., Zhang, X., Li, Y.: A cluster validity index for fuzzy clustering. Inf. Sci. 178, 1205–1218 (2008)
Yue, S., Wang, J.S., Wu, T., Wang, H.: A new separation measure for improving the effectiveness of validity indices. Inf. Sci. 180, 748–764 (2010)
Acknowledgments
This work was partially supported by the Natural Sciences and Engineering Research Council (NSERC) Grant (RPGIN 7698-05) to the University of Toronto. Partial support was also provided by Hacettepe University and TOBB Economics and Technology University. Their support is greatly appreciated.
Copyright information
© 2013 Springer Science+Business Media New York
About this chapter
Cite this chapter
Ozkan, I., Türkşen, I.B. (2013). A Review of Cluster Validation with an Example of Type-2 Fuzzy Application in R. In: Sadeghian, A., Mendel, J., Tahayori, H. (eds) Advances in Type-2 Fuzzy Sets and Systems. Studies in Fuzziness and Soft Computing, vol 301. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-6666-6_14
DOI: https://doi.org/10.1007/978-1-4614-6666-6_14
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-6665-9
Online ISBN: 978-1-4614-6666-6
eBook Packages: Engineering, Engineering (R0)