Chameleon algorithm based on mutual k-nearest neighbors

  • Original Paper
Applied Intelligence

Abstract

Clustering is a typical unsupervised data analysis method that divides an unlabeled data set into multiple clusters. Data within each cluster are strongly associated, so clustering can serve as a preprocessing stage for other algorithms or as a basis for further association analysis, and it therefore plays an important role in a wide range of fields. Chameleon is a clustering algorithm that combines relative interconnectivity and relative closeness to find high-quality clusters of arbitrary shape. However, the hMETIS graph-partitioning algorithm it relies on is difficult to operate and tends to make the results uncertain. In addition, the final number of clusters needs to be specified by the user as a parameter to stop merging, which is difficult to determine without prior information. To address these shortcomings, a Chameleon algorithm based on mutual k-nearest neighbors (MChameleon) is proposed. First, the idea of mutual k-nearest neighbors is introduced to generate sub-clusters directly, which omits the graph-partitioning step. Then, the concept of MC modularity is introduced to identify the final clustering result objectively. In experiments on artificial data sets and UCI data sets, we compared the accuracy of MChameleon with that of the original Chameleon algorithm, the improved AChameleon algorithm, and the classic K-Means, DBSCAN, and BIRCH algorithms. The experimental results show that MChameleon is feasible and offers clear advantages.
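To make the mutual k-nearest-neighbor idea concrete, the following is a minimal sketch in Python and not the authors' implementation. It assumes Euclidean distance and treats each connected component of the mutual k-NN graph as an initial sub-cluster; both choices, and the function names `mutual_knn_graph` and `connected_components`, are illustrative assumptions, since the abstract does not specify the exact procedure. Two points are linked only when each is among the other's k nearest neighbors.

```python
import numpy as np


def mutual_knn_graph(X, k):
    """Boolean adjacency matrix of the mutual k-NN graph.

    Assumption: Euclidean distance. Points i and j are connected only if
    j is one of i's k nearest neighbors AND i is one of j's.
    Requires 1 <= k <= n - 1.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise Euclidean distances (O(n^2) memory; fine for a sketch).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a point is never its own neighbor
    # Column indices of the k nearest neighbors of every point.
    nn = np.argsort(dist, axis=1)[:, :k]
    knn = np.zeros((n, n), dtype=bool)
    knn[np.repeat(np.arange(n), k), nn.ravel()] = True
    # Keep an edge only when it exists in both directions.
    return knn & knn.T


def connected_components(adj):
    """Label the connected components of an undirected boolean adjacency matrix."""
    n = adj.shape[0]
    labels = np.full(n, -1, dtype=int)
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        # Depth-first traversal from an unlabeled point.
        labels[start] = current
        stack = [start]
        while stack:
            u = stack.pop()
            for v in np.flatnonzero(adj[u]):
                if labels[v] == -1:
                    labels[v] = current
                    stack.append(v)
        current += 1
    return labels


if __name__ == "__main__":
    # Hypothetical toy data: two well-separated groups of three points.
    X = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
    adj = mutual_knn_graph(X, k=2)
    print(connected_components(adj))
```

In this sketch a point with no mutual neighbors ends up as a singleton component, so a full algorithm would still need a rule for handling such points, followed by a merging phase and a stopping criterion.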

Acknowledgements

This work is supported by the National Natural Science Foundation of China (nos. 61672522 and 61976216).

Author information

Corresponding author

Correspondence to Shifei Ding.

Ethics declarations

Conflict of interest

The authors declare that they have no conflicts of interest regarding this work.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhang, Y., Ding, S., Wang, L. et al. Chameleon algorithm based on mutual k-nearest neighbors. Appl Intell 51, 2031–2044 (2021). https://doi.org/10.1007/s10489-020-01926-7

  • DOI: https://doi.org/10.1007/s10489-020-01926-7
