Abstract
This study proposes a new initialization algorithm to address the problem of random initialization in the rough k-means clustering algorithm refined by Peters. A new method for choosing appropriate zeta values in Peters' algorithm is also proposed, along with a new performance measure for rough clustering, the S/O index [within-variance (S) / total-variance (O)]. Performance criteria such as root-mean-square standard deviation, the S/O index, and running time are used to compare the proposed initialization and random initialization within Peters' refined rough k-means. Other popular initialization algorithms, namely k-means++, Peters' Π, Bradley's, and Ioannis', are also compared. The proposed initialization algorithm is found to outperform the existing initialization algorithms with Peters' refined rough k-means clustering algorithm on different datasets with varying zeta values.
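The abstract describes the S/O index as a ratio of within-cluster variance (S) to total variance (O). The following is a minimal, illustrative sketch of such a ratio for hard cluster assignments; it is one plausible reading of the measure named above, not the authors' exact formulation (which operates on rough, interval-set clusters):

```python
import numpy as np

def s_over_o_index(X, labels):
    """Illustrative within-variance (S) / total-variance (O) ratio.

    X: (n, d) data matrix; labels: integer cluster assignment per point.
    Lower values indicate clusters that are compact relative to the
    overall spread of the data. Hypothetical hard-clustering version,
    not the rough k-means formulation from the paper.
    """
    X = np.asarray(X, dtype=float)
    overall_mean = X.mean(axis=0)
    total_var = np.sum((X - overall_mean) ** 2)  # O: total variance
    within_var = 0.0
    for k in np.unique(labels):
        members = X[labels == k]
        # S: sum of squared deviations from each cluster's own mean
        within_var += np.sum((members - members.mean(axis=0)) ** 2)
    return within_var / total_var

# Two well-separated blobs: a good labeling yields a small S/O ratio,
# while mixing the blobs across clusters drives the ratio toward 1.
X = np.array([[0.0, 0.0], [0.1, 0.1], [5.0, 5.0], [5.1, 5.1]])
good = np.array([0, 0, 1, 1])
bad = np.array([0, 1, 0, 1])
print(s_over_o_index(X, good))  # small (compact clusters)
print(s_over_o_index(X, bad))   # close to 1 (clusters as spread as data)
```

A smaller ratio under the better labeling is the behavior one would expect an initialization method to be rewarded for under such an index.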
References
Arthur D, Vassilvitskii S (2007) K-means++: the advantages of careful seeding. In: Proceedings of the eighteenth annual ACM-SIAM symposium on discrete algorithms, pp 1027–1035. https://doi.org/10.1145/1283383.1283494
Bhargava R, Tripathy BK, Tripathy A et al (2013) Rough intuitionistic fuzzy C-means algorithm and a comparative analysis. In: Compute 2013—6th ACM India computing convention: next generation computing paradigms and technologies
Bradley PS, Fayyad UM (1998) Refining initial points for K-means clustering. In: ICML proceedings of the fifteenth international conference on machine learning, 24–27 July 1998, pp 91–99. ISBN:1-55860-556-8
Bubeck S, Meilă M, von Luxburg U (2012) How the initialization affects the stability of the k-means algorithm. ESAIM Probab Stat 16:436–452. https://doi.org/10.1051/ps/2012013
Darken C, Moody J (1990) Fast adaptive k-means clustering: some empirical results. In: IJCNN international joint conference on neural networks, San Diego, CA, USA, 17–21 June 1990, pp 233–238. https://doi.org/10.1109/IJCNN.1990.137720
Davies DL, Bouldin DW (1979) A cluster separation measure. IEEE Trans Pattern Anal Mach Intell PAMI-1:224–227. https://doi.org/10.1109/TPAMI.1979.4766909
Deng W, Zhao H, Yang X et al (2017a) Study on an improved adaptive PSO algorithm for solving multi-objective gate assignment. Appl Soft Comput J 59:288–302. https://doi.org/10.1016/j.asoc.2017.06.004
Deng W, Zhao H, Zou L et al (2017b) A novel collaborative optimization algorithm in solving complex optimization problems. Soft Comput 21:4387–4398. https://doi.org/10.1007/s00500-016-2071-8
Deng W, Xu J, Zhao H (2019) An improved ant colony optimization algorithm based on hybrid strategies for scheduling problem. IEEE Access 7:20281–20292. https://doi.org/10.1109/ACCESS.2019.2897580
Fisher RA (1936) The use of multiple measurements in taxonomic problems. Ann Eugen 7:179–188
Forgy CL (1982) Rete: a fast algorithm for the many patterns/many object pattern match problem. Artif Intell 19:17–37. https://doi.org/10.1016/0004-3702(82)90020-0
Gonzalez TF (1985) Clustering to minimize the maximum intercluster distance. Theor Comput Sci 38:293–306
Halkidi M, Batistakis Y, Vazirgiannis M (2002) Clustering validity checking methods: part II. ACM SIGMOD Rec 31(3):19–27. https://doi.org/10.1145/601858.601862
Han J, Kamber M, Pei J (2012) Data mining: concepts and techniques. Morgan Kaufmann Publishers, Waltham
Hu J, Li T, Wang H, Fujita H (2016) Hierarchical cluster ensemble model based on knowledge granulation. Knowl Based Syst 91:179–188. https://doi.org/10.1016/j.knosys.2015.10.006
Jain AK, Dubes RC (1988) Algorithms for clustering data. Prentice Hall, Englewood Cliffs
Jain AK, Murty MN, Flynn PJ (1999) Data clustering: a review. ACM Comput Surv 31:264–323. https://doi.org/10.1145/331499.331504
Katsavounidis I, Kuo CCJ, Zhang Z (1994) A new initialization technique for generalized Lloyd iteration. IEEE Signal Process Lett 1:144–146. https://doi.org/10.1109/97.329844
Kim EH, Oh SK, Pedrycz W (2018) Design of reinforced interval Type-2 fuzzy C-means-based fuzzy classifier. IEEE Trans Fuzzy Syst. https://doi.org/10.1109/TFUZZ.2017.2785244
Lingras P, Peters G (2011) Rough clustering. Wiley Interdiscip Rev Data Min Knowl Discov 1:64–72. https://doi.org/10.1002/widm.16
Lingras P, Triff M (2016) Advances in rough and soft clustering: meta-clustering, dynamic clustering, data-stream clustering. In: Lecture notes in computer science (including subseries lecture notes in artificial intelligence and lecture notes in bioinformatics). 9920 LNAI, pp 3–22. https://doi.org/10.1007/978-3-319-47160-0_1
Lingras P, West C (2004) Interval set clustering of web users with rough K-means. J Intell Inf Syst 23:5–16. https://doi.org/10.1023/B:JIIS.0000029668.88665.1a
Lord E, Willems M, Lapointe FJ, Makarenkov V (2017) Using the stability of objects to determine the number of clusters in datasets. Inf Sci (NY) 393:29–46. https://doi.org/10.1016/j.ins.2017.02.010
Maji P, Pal SK (2007) Rough set based generalized fuzzy C-means algorithm and quantitative indices. IEEE Trans Syst Man Cybern Part B 37:1529–1540. https://doi.org/10.1109/TSMCB.2007.906578
Mitra S, Banka H (2007) Application of rough sets in pattern recognition. In: Lecture notes in computer science (including subseries lecture notes in artificial intelligence and lecture notes in bioinformatics). pp 151–169
Pawlak Z (1982) Rough sets. Int J Comput Inf Sci 11:341–356. https://doi.org/10.1007/BF01001956
Pawlak Z, Skowron A (2007) Rough sets: some extensions. Inf Sci 177:28–40. https://doi.org/10.1016/j.ins.2006.06.006
Peters G (2006) Some refinements of rough k-means clustering. Pattern Recognit 39:1481–1491. https://doi.org/10.1016/j.patcog.2006.02.002
Peters G (2014) Rough clustering utilizing the principle of indifference. Inf Sci (NY) 277:358–374. https://doi.org/10.1016/j.ins.2014.02.073
Peters G (2015) Is there any need for rough clustering? Pattern Recognit Lett 53:31–37. https://doi.org/10.1016/j.patrec.2014.11.003
Peters G, Crespo F, Lingras P, Weber R (2013) Soft clustering—fuzzy and rough approaches and their extensions and derivatives. Int J Approx Reason 54:307–322. https://doi.org/10.1016/j.ijar.2012.10.003
Stetco A, Zeng XJ, Keane J (2015) Fuzzy C-means++: fuzzy C-means with effective seeding initialization. Expert Syst Appl. https://doi.org/10.1016/j.eswa.2015.05.014
Su T, Dy JG (2007) In search of deterministic methods for initializing K-means and Gaussian mixture clustering. Intell Data Anal 11:319–338. https://doi.org/10.3233/ida-2007-11402
Zadeh LA (1965) Fuzzy sets. Inf Control 8:338–353. https://doi.org/10.1016/S0019-9958(65)90241-X
Zhang K (2019) A three-way c-means algorithm. Appl Soft Comput J. https://doi.org/10.1016/j.asoc.2019.105536
Zhang T, Ma F, Yue D et al (2019) Interval Type-2 fuzzy local enhancement based rough k-means clustering considering imbalanced clusters. IEEE Trans Fuzzy Syst. https://doi.org/10.1109/tfuzz.2019.2924402
Zhao H, Liu H, Xu J, Deng W (2019a) Performance prediction using high-order differential mathematical morphology gradient spectrum entropy and extreme learning machine. IEEE Trans Instrum Meas. https://doi.org/10.1109/TIM.2019.2948414
Zhao H, Zheng J, Xu J, Deng W (2019b) Fault diagnosis method based on principal component analysis and broad learning system. IEEE Access 7:99263–99272. https://doi.org/10.1109/access.2019.2929094
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Communicated by V. Loia.
Cite this article
Murugesan, V.P., Murugesan, P. A new initialization and performance measure for the rough k-means clustering. Soft Comput 24, 11605–11619 (2020). https://doi.org/10.1007/s00500-019-04625-9