Abstract
Abnormal points (noise and outliers) generally reduce the accuracy of cluster analysis, especially in fuzzy clustering. These points not only remain inside clusters but also pull the centroids away from their true positions. Traditional fuzzy clustering algorithms such as Fuzzy C-Means (FCM) always assign every data point to all clusters, which is not reasonable in some circumstances. Reformulating the objective function as an exponential equation makes the algorithm select data into clusters more aggressively. However, the clustering process still cannot handle noisy data and outliers properly: they are forced into a cluster by the general probabilistic constraint that the membership degrees of a point across all clusters sum to one. To address this weakness, the possibilistic approach relaxes this constraint to improve membership assignment. Nevertheless, possibilistic clustering algorithms generally suffer from coincident clusters because their membership equations ignore the distance to other clusters. Although some possibilistic clustering approaches do not generate coincident clusters, most of them work only with the right combination of multiple parameters. In this paper, we theoretically study Possibilistic Exponential Fuzzy Clustering (PXFCM), which integrates the possibilistic approach with exponential fuzzy clustering. PXFCM has only one parameter, and it not only partitions the data but also filters noisy data or detects them as outliers. Comprehensive experiments show that PXFCM achieves high accuracy in both clustering and outlier detection without generating coincident clusters.
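The contrast between the probabilistic constraint and its possibilistic relaxation can be illustrated with a minimal sketch. It uses the standard FCM membership formula and the standard possibilistic typicality formula, not the exponential formulation of PXFCM, which the abstract does not state. The function names, the toy centroids, and the scale values `etas` are assumptions made only for this illustration.

```python
import math

def fcm_memberships(point, centroids, m=2.0):
    """Standard FCM memberships of one point; they always sum to 1."""
    dists = [math.dist(point, c) for c in centroids]
    # A point lying exactly on a centroid belongs fully to that centroid.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_j / d_k) ** exponent for d_k in dists) for d_j in dists]

def pcm_typicalities(point, centroids, etas, m=2.0):
    """Standard possibilistic typicalities; no sum-to-one constraint.

    Each value depends only on the distance to its own centroid,
    scaled by a per-cluster parameter eta (hypothetical values here).
    """
    dists_sq = [math.dist(point, c) ** 2 for c in centroids]
    exponent = 1.0 / (m - 1.0)
    return [1.0 / (1.0 + (d2 / eta) ** exponent) for d2, eta in zip(dists_sq, etas)]

# Toy setup (assumed): two fixed centroids, one inlier, one far outlier.
centroids = [(0.0, 0.0), (10.0, 0.0)]
etas = [1.0, 1.0]
inlier = (0.5, 0.0)
outlier = (5.0, 100.0)

u_in = fcm_memberships(inlier, centroids)
u_out = fcm_memberships(outlier, centroids)
t_in = pcm_typicalities(inlier, centroids, etas)
t_out = pcm_typicalities(outlier, centroids, etas)

print("FCM inlier :", u_in)
print("FCM outlier:", u_out)   # forced to share a total membership of 1
print("PCM inlier :", t_in)
print("PCM outlier:", t_out)   # near zero for every cluster
```

In this toy case the outlier is equidistant from both centroids, so FCM gives it membership 0.5 in each cluster, while both possibilistic typicalities are close to zero. The sketch also hints at why coincident clusters can arise: each typicality is computed from the distance to a single centroid, so nothing discourages two centroids from settling on the same group of points.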
Cite this article
Treerattanapitak, K., Jaruskulchai, C. Possibilistic Exponential Fuzzy Clustering. J. Comput. Sci. Technol. 28, 311–321 (2013). https://doi.org/10.1007/s11390-013-1331-7
DOI: https://doi.org/10.1007/s11390-013-1331-7