A population-based clustering technique using particle swarm optimization and k-means

Abstract

A population-based clustering technique is proposed that integrates different particle swarm optimizers (PSOs) with the well-known k-means algorithm. More specifically, six extensively studied PSOs, which have shown promising performance in continuous optimization, are each hybridized with Lloyd’s k-means algorithm, yielding six PSO-based clustering methods. These PSO-based approaches use different social communication schemes among neighbors to help particles escape local optima and thereby enhance exploration, while k-means refines the partitioning results to accelerate convergence. Comparative experiments on 12 synthetic and real-life datasets show that the proposed population-based clustering technique obtains better and more stable solutions than five individual-based counterparts in most cases. Further, the effects of four population topologies, three kinds of parameter settings, and two types of initialization methods on clustering performance are empirically investigated. Moreover, seven boundary handling strategies for PSOs are summarized for the first time. Finally, some unexpected conclusions are drawn from the experiments.
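
To make the hybridization idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a plain global-best PSO with an inertia weight (only one of many possible variants and topologies), the sum of squared errors as the fitness function, one Lloyd iteration per particle per generation as the refinement step, and clamping to the data's bounding box as a simple boundary handling choice. The function names (`sse`, `lloyd_step`, `pso_kmeans`) and the parameter values `w=0.72`, `c1=c2=1.49` are illustrative assumptions, not settings taken from the paper.

```python
import random

def sse(data, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum(min(sum((x - c) ** 2 for x, c in zip(p, cen)) for cen in centroids)
               for p in data)

def lloyd_step(data, centroids):
    """One Lloyd iteration: assign points to the nearest centroid, then average."""
    dim = len(data[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for p in data:
        j = min(range(len(centroids)),
                key=lambda i: sum((x - c) ** 2 for x, c in zip(p, centroids[i])))
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += p[d]
    # An empty cluster keeps its previous centroid.
    return [[s / counts[j] for s in sums[j]] if counts[j] else list(centroids[j])
            for j in range(len(centroids))]

def pso_kmeans(data, k, n_particles=10, iters=30, w=0.72, c1=1.49, c2=1.49, seed=0):
    """Global-best PSO over centroid sets, refined by one Lloyd step per move."""
    rng = random.Random(seed)
    dim = len(data[0])
    lo = [min(p[d] for p in data) for d in range(dim)]
    hi = [max(p[d] for p in data) for d in range(dim)]
    # Each particle encodes k centroids, initialized from random data points.
    pos = [[list(p) for p in rng.sample(data, k)] for _ in range(n_particles)]
    vel = [[[0.0] * dim for _ in range(k)] for _ in range(n_particles)]
    pbest = [[c[:] for c in x] for x in pos]
    pbest_f = [sse(data, x) for x in pos]
    g = min(range(n_particles), key=pbest_f.__getitem__)
    gbest, gbest_f = [c[:] for c in pbest[g]], pbest_f[g]
    for _ in range(iters):
        for i in range(n_particles):
            for j in range(k):
                for d in range(dim):
                    r1, r2 = rng.random(), rng.random()
                    vel[i][j][d] = (w * vel[i][j][d]
                                    + c1 * r1 * (pbest[i][j][d] - pos[i][j][d])
                                    + c2 * r2 * (gbest[j][d] - pos[i][j][d]))
                    # Assumed boundary handling: clamp to the data bounding box.
                    pos[i][j][d] = min(hi[d], max(lo[d], pos[i][j][d] + vel[i][j][d]))
            pos[i] = lloyd_step(data, pos[i])  # k-means refinement of the particle
            f = sse(data, pos[i])
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = [c[:] for c in pos[i]], f
                if f < gbest_f:
                    gbest, gbest_f = [c[:] for c in pos[i]], f
    return gbest, gbest_f

# Toy usage: two well-separated groups of 2-D points.
data = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
centroids, err = pso_kmeans(data, k=2)
```

In this sketch the swarm supplies exploration (several candidate centroid sets that pull toward personal and global bests), while the Lloyd step supplies fast local refinement. Swapping the velocity rule or the neighborhood used for `gbest` would be the natural place to try other PSO variants or topologies.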

Notes

  1. http://archive.ics.uci.edu/ml/

  2. http://en.wikipedia.org/wiki/Single-linkage_clustering

  3. http://www.mathworks.cn/cn/help/stats/hierarchical-clustering.html

  4. http://en.wikipedia.org/wiki/Single-linkage_clustering

  5. http://nlp.stanford.edu/IR-book/completelink.html

  6. http://lear.inrialpes.fr/~verbeek/software

  7. http://www.mathworks.cn/cn/help/stats/kmeans.html

Acknowledgments

This work is partially supported by the National Natural Science Foundation of China (Grant Nos. 71571120, 71001072, 71271140, 71471158, 71461027, 61472257).

Author information

Correspondence to Ben Niu or Lijing Tan.

About this article

Cite this article

Niu, B., Duan, Q., Liu, J. et al. A population-based clustering technique using particle swarm optimization and k-means. Nat Comput 16, 45–59 (2017). https://doi.org/10.1007/s11047-016-9542-9

Keywords

  • Population-based clustering technique
  • Particle swarm optimization (PSO)
  • Lloyd’s k-means