Survey of Improved k-means Clustering Algorithms: Improvements, Shortcomings and Scope for Further Enhancement and Scalability

  • Conference paper
Information Systems Design and Intelligent Applications

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 434)

Abstract

Clustering algorithms are widely used in many fields of science, engineering, and technology. k-means is an example of an unsupervised clustering algorithm, used in applications such as medical image clustering and gene data clustering. A large body of research has sought to enhance the basic k-means algorithm, but researchers have focused on only some of its limitations. This paper studies part of the literature on improved k-means algorithms, summarizes their shortcomings, and identifies scope for further enhancement to make k-means more scalable and efficient for large data. From the literature, the paper examines distance, validity, and stability measures, algorithms for selecting initial centroids, and algorithms for deciding the value of k. It then proposes objectives and guidelines for an enhanced, scalable clustering algorithm, and suggests a method for avoiding outliers using concepts from semantic analysis and AI.
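For orientation, the basic k-means procedure that the surveyed improvements start from can be sketched as follows. This is a minimal illustration and not code from the paper: the function name `kmeans`, its parameters, the random choice of initial centroids, and the use of squared Euclidean distance are all assumptions made for the example.

```python
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Basic Lloyd-style k-means on a list of equal-length numeric tuples.

    Hypothetical helper for illustration only; returns (centroids, labels).
    """
    rng = random.Random(seed)
    # Initial centroids: k distinct input points picked at random.
    # Initial centroid selection is one of the steps the abstract lists
    # as a target for improvement.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)

    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid
        # (squared Euclidean distance; ties go to the lower index).
        new_labels = [
            min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])),
            )
            for p in points
        ]

        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[j])

        converged = new_labels == labels and new_centroids == centroids
        labels, centroids = new_labels, new_centroids
        if converged:
            break

    return centroids, labels


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centroids, labels = kmeans(data, k=2)
    print(centroids, labels)
```

In this sketch the value of k is supplied by the caller and every point, outliers included, contributes to a centroid mean. Those are the kinds of limitations the abstract names as scope for enhancement.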



Author information

Corresponding author

Correspondence to Anand Khandare.

Copyright information

© 2016 Springer India

About this paper

Cite this paper

Khandare, A., Alvi, A.S. (2016). Survey of Improved k-means Clustering Algorithms: Improvements, Shortcomings and Scope for Further Enhancement and Scalability. In: Satapathy, S.C., Mandal, J.K., Udgata, S.K., Bhateja, V. (eds) Information Systems Design and Intelligent Applications. Advances in Intelligent Systems and Computing, vol 434. Springer, New Delhi. https://doi.org/10.1007/978-81-322-2752-6_48

  • DOI: https://doi.org/10.1007/978-81-322-2752-6_48

  • Publisher Name: Springer, New Delhi

  • Print ISBN: 978-81-322-2750-2

  • Online ISBN: 978-81-322-2752-6

  • eBook Packages: Engineering, Engineering (R0)
