A novel cluster validity index for fuzzy C-means algorithm
Determining the number of clusters is a central problem in many clustering applications. To address it, this paper proposes a new clustering approach that combines an improved morphology similarity distance with a novel cluster validity index. An optimized morphology similarity distance, based on the standard Euclidean distance and the ReliefF algorithm, is used to construct the new validity index, which balances intra-cluster and inter-cluster consistency. The proposed validity index is combined with fuzzy C-means to produce a new algorithm, named OMS-OSC. Experimental results on several artificial and real-world data sets show that the new algorithm not only performs well but also detects the correct cluster number.
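The general workflow the abstract describes (run fuzzy C-means for several candidate cluster numbers, score each partition with a validity index, keep the best-scoring number) can be sketched as follows. This is only an illustration under stated assumptions: the optimized morphology similarity distance and the proposed index are not defined in this section, so the sketch substitutes the plain squared Euclidean distance and the well-known Xie-Beni index (lower is better) as stand-ins. All function names (`sq_dist`, `maximin_init`, `memberships`, `fcm`, `xie_beni`, `choose_cluster_number`), the deterministic farthest-point initialization, and the toy data are hypothetical choices made for this example, not part of the OMS-OSC algorithm.

```python
def sq_dist(a, b):
    """Squared Euclidean distance (stand-in for the paper's distance)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def maximin_init(points, c):
    """Deterministic start: first point, then the point farthest from chosen centers."""
    centers = [list(points[0])]
    while len(centers) < c:
        far = max(points, key=lambda p: min(sq_dist(p, v) for v in centers))
        centers.append(list(far))
    return centers


def memberships(points, centers, m):
    """FCM membership update: u_ik is proportional to d_ik^(-2/(m-1))."""
    u = []
    for p in points:
        inv = [max(sq_dist(p, v), 1e-12) ** (-1.0 / (m - 1.0)) for v in centers]
        total = sum(inv)
        u.append([w / total for w in inv])
    return u


def fcm(points, c, m=2.0, iters=200, tol=1e-8):
    """Plain fuzzy C-means; returns (centers, membership matrix)."""
    centers = maximin_init(points, c)
    u = memberships(points, centers, m)
    dim = len(points[0])
    for _ in range(iters):
        new_centers = []
        for k in range(c):
            # Each center is the mean of all points weighted by u_ik^m.
            w = [row[k] ** m for row in u]
            total = sum(w)
            new_centers.append(
                [sum(wi * p[d] for wi, p in zip(w, points)) / total for d in range(dim)]
            )
        shift = max(sq_dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        u = memberships(points, centers, m)
        if shift < tol:  # stop once no center moves noticeably
            break
    return centers, u


def xie_beni(points, centers, u, m=2.0):
    """Xie-Beni index (stand-in for the paper's index): compactness / separation."""
    compact = sum(
        u[i][k] ** m * sq_dist(p, v)
        for i, p in enumerate(points)
        for k, v in enumerate(centers)
    )
    sep = min(sq_dist(a, b) for i, a in enumerate(centers) for b in centers[i + 1:])
    return compact / (len(points) * max(sep, 1e-12))


def choose_cluster_number(points, c_min=2, c_max=5):
    """Score every candidate cluster number and return the lowest-scoring one."""
    scores = {}
    for c in range(c_min, c_max + 1):
        centers, u = fcm(points, c)
        scores[c] = xie_beni(points, centers, u)
    return min(scores, key=scores.get), scores


# Toy data (assumption): three compact, well-separated groups of five points.
offsets = [(-0.5, -0.5), (-0.5, 0.5), (0.5, -0.5), (0.5, 0.5), (0.0, 0.0)]
blob_centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
points = [(cx + dx, cy + dy) for cx, cy in blob_centers for dx, dy in offsets]

best, scores = choose_cluster_number(points)
print("selected cluster number:", best)
print("index value per candidate:", scores)
```

Swapping in a different distance or validity index only requires replacing `sq_dist` and `xie_beni`; the selection loop in `choose_cluster_number` stays the same, which is why the validity index can be treated as a separate component from the clustering step.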
Keywords: Clustering applications; Optimized morphology similarity distance; New validity index; Fuzzy C-means; Cluster number
This work was supported by the National Natural Science Foundation of China (Grant Nos. 61573157, 61561024, and 61562038), the Natural Science Foundation of Guangdong Province of China (Grant No. 2014A030313454), and the Key Project of Natural Statistical Science and Research (Grant No. 2015LZ30).
Compliance with ethical standards
Conflict of interest
The authors declare that they have no conflict of interest.
This article does not contain any studies with human participants performed by any of the authors.