Fifty Years of Fuzzy Logic and its Applications, pp 315-335
Handling Noise and Outliers in Fuzzy Clustering
Abstract
Since it is an unsupervised data analysis approach, clustering relies solely on the location of the data points in the data space or, alternatively, on their relative distances or similarities. As a consequence, clustering can suffer from the presence of noisy data points and outliers, which can obscure the structure of the clusters in the data and thus may drive clustering algorithms to yield suboptimal or even misleading results. Fuzzy clustering is no exception in this respect, although it features an aspect of robustness, due to which outliers and generally data points that are atypical for the clusters in the data have a lesser influence on the cluster parameters. Starting from this aspect, we provide in this paper an overview of different approaches with which fuzzy clustering can be made less sensitive to noise and outliers and categorize them according to the component of standard fuzzy clustering they modify.
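The robustness aspect mentioned above can be illustrated with a minimal sketch. It assumes the standard fuzzy c-means setting, which the abstract does not spell out: squared Euclidean distances, a fuzzifier m > 1, membership degrees that sum to 1 for each data point, and center updates that weight each point by its membership degree raised to the power m. The function names `fcm_memberships` and `update_weights` and the example coordinates are hypothetical and chosen only for illustration.

```python
def fcm_memberships(point, centers, m=2.0):
    """Membership degrees of one point under standard fuzzy c-means (assumed setting)."""
    # Squared Euclidean distance from the point to every cluster center.
    d2 = [sum((p - c) ** 2 for p, c in zip(point, center)) for center in centers]
    # A point lying exactly on one or more centers belongs fully to those centers.
    if any(d == 0.0 for d in d2):
        hits = [1.0 if d == 0.0 else 0.0 for d in d2]
        return [h / sum(hits) for h in hits]
    # u_j is proportional to d_j^(-2/(m-1)); with squared distances the exponent is -1/(m-1).
    inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
    total = sum(inv)
    return [v / total for v in inv]


def update_weights(point, centers, m=2.0):
    """Weight u^m with which the point enters each cluster center update."""
    return [u ** m for u in fcm_memberships(point, centers, m)]


# Hypothetical example data: two centers, one typical point, one outlier.
centers = [(0.0, 0.0), (10.0, 0.0)]
typical = (0.5, 0.0)     # close to the first center
outlier = (5.0, 100.0)   # equally far from both centers

w_typical = update_weights(typical, centers)
w_outlier = update_weights(outlier, centers)
print("typical point weights:", w_typical)
print("outlier weights:      ", w_outlier)
```

In this sketch the outlier receives membership 1/2 in each cluster and therefore enters each center update with weight 1/4, whereas the typical point enters the update of its own cluster with a weight close to 1. The outlier's weight is reduced but not zero, which is consistent with the abstract's point that fuzzy clustering is still affected by noise and outliers.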
Keywords
Cluster Center, Fuzzy Cluster, Membership Degree, Actual Cluster, Cluster Parameter