Abstract
Fuzzy techniques have been used to handle the vague boundaries of arbitrarily oriented clusters. However, traditional clustering algorithms tend to break down in high-dimensional spaces because of the inherent sparsity of the data. We propose a modification of the objective function of the Gustafson-Kessel clustering algorithm for projected clustering and prove the convergence of the resulting algorithm. We present the results of applying the proposed projected Gustafson-Kessel clustering algorithm to synthetic and UCI data sets, and suggest a way of extending it to a rough-set-based algorithm.
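For context, the standard (unmodified) Gustafson-Kessel objective function is sketched below. The abstract does not state the proposed modification, so this shows only the well-known baseline, not the projected variant. The symbols are assumptions introduced here for illustration: c clusters, N data points x_k in n dimensions, memberships u_{ik}, fuzzifier m > 1, cluster centres v_i, and cluster volumes \rho_i.

```latex
% Standard Gustafson-Kessel objective (baseline only; not the paper's modified form)
J(U, V, A) = \sum_{i=1}^{c} \sum_{k=1}^{N} u_{ik}^{m}\,
             (x_k - v_i)^{T} A_i \,(x_k - v_i),
\qquad \text{subject to } \sum_{i=1}^{c} u_{ik} = 1 \ \ \forall k,
\quad \det A_i = \rho_i .

% Norm-inducing matrix obtained from the fuzzy covariance matrix F_i
A_i = \left( \rho_i \det F_i \right)^{1/n} F_i^{-1},
\qquad
F_i = \frac{\sum_{k=1}^{N} u_{ik}^{m}\,(x_k - v_i)(x_k - v_i)^{T}}
           {\sum_{k=1}^{N} u_{ik}^{m}} .
```

Because each cluster carries its own matrix A_i, the induced distance adapts to the cluster's orientation and shape, which is what lets the baseline algorithm capture arbitrarily oriented clusters.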
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Puri, C., Kumar, N. (2011). Projected Gustafson-Kessel Clustering Algorithm and Its Convergence. In: Peters, J.F., et al. Transactions on Rough Sets XIV. Lecture Notes in Computer Science, vol 6600. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21563-6_9
DOI: https://doi.org/10.1007/978-3-642-21563-6_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21562-9
Online ISBN: 978-3-642-21563-6
eBook Packages: Computer Science (R0)