Survey of Improved k-means Clustering Algorithms: Improvements, Shortcomings and Scope for Further Enhancement and Scalability

Khandare, Anand; Alvi, A. S.

doi:10.1007/978-81-322-2752-6_48

Anand Khandare & A. S. Alvi

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 434)

Abstract

Clustering algorithms are widely used across science, engineering, and technology. k-means is an unsupervised clustering algorithm used in applications such as medical image clustering and gene data clustering. A large body of research has sought to enhance the basic k-means algorithm, but researchers have focused on only some of its limitations. This paper studies part of the literature on improved k-means algorithms, summarizes their shortcomings, and identifies scope for further enhancement to make the algorithm more scalable and efficient for large data. From the literature, the paper examines distance, validity, and stability measures, algorithms for selecting initial centroids, and algorithms for deciding the value of k. It then proposes objectives and guidelines for an enhanced, scalable clustering algorithm, and suggests a method for avoiding outliers based on semantic analysis and AI.
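For orientation, the baseline that the surveyed improvements start from can be sketched as follows. This is a minimal illustration of basic (Lloyd's) k-means, not the authors' proposed algorithm. The function names `kmeans` and `squared_distance`, the random seeding strategy, and the parameters `max_iter` and `seed` are all assumptions made for this example. The comments mark the two choices the abstract singles out: how the initial centroids are selected and how k is decided (here k is simply supplied by the caller).

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iter=100, seed=0):
    """Basic (Lloyd's) k-means; returns (centroids, labels).

    k is an input here; choosing it automatically is one of the
    enhancement areas the abstract mentions.
    """
    rng = random.Random(seed)
    # Initial centroid selection (assumed strategy): k distinct random points.
    centroids = [tuple(p) for p in rng.sample(points, k)]
    labels = [-1] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels
```

Calling `kmeans(points, 2)` on a list of coordinate tuples returns the final centroids and one cluster label per point. Every iteration rescans all points against all centroids, which is the cost that scalability-oriented variants aim to reduce, and the result depends on the random starting centroids, which is why seeding methods receive attention.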

Author information

Authors and Affiliations

Department of CSE, SGB Amravati University, Amravati, India
Anand Khandare
Department of CSE, PRMIT&R, Badnera, Amravati, India
A. S. Alvi

Corresponding author

Correspondence to Anand Khandare.

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, Anil Neerukonda Institute of Technology and Sciences, Visakhapatnam, India
Suresh Chandra Satapathy
Kalyani University, Nadia, West Bengal, India
Jyotsna Kumar Mandal
University of Hyderabad, Hyderabad, India
Siba K. Udgata
Department of Electronics and Communication Engineering, Shri Ramswaroop Memorial Group of Professional Colleges, Lucknow, Uttar Pradesh, India
Vikrant Bhateja

About this paper

Cite this paper

Khandare, A., Alvi, A.S. (2016). Survey of Improved k-means Clustering Algorithms: Improvements, Shortcomings and Scope for Further Enhancement and Scalability. In: Satapathy, S.C., Mandal, J.K., Udgata, S.K., Bhateja, V. (eds) Information Systems Design and Intelligent Applications. Advances in Intelligent Systems and Computing, vol 434. Springer, New Delhi. https://doi.org/10.1007/978-81-322-2752-6_48

DOI: https://doi.org/10.1007/978-81-322-2752-6_48
Published: 03 February 2016
Publisher Name: Springer, New Delhi
Print ISBN: 978-81-322-2750-2
Online ISBN: 978-81-322-2752-6
eBook Packages: Engineering, Engineering (R0)
