Soft Computing, Volume 20, Issue 10, pp 3835–3849

Towards improved benchmarking of black-box optimization algorithms using clustering problems

  • Marcus Gallagher


The field of metaheuristics has produced a large number of algorithms for continuous, black-box optimization. In contrast, there are few standard benchmark problem sets, limiting our ability to gain insight into the empirical performance of these algorithms. Clustering problems have been used many times in the literature to evaluate optimization algorithms. However, much of this work has been carried out independently on different problem instances, and the varying experimental methodologies have produced results that are frequently incomparable, offer little insight into the difficulty of the problems used, and provide no common platform for comparing and evaluating algorithm performance. This paper discusses sum-of-squares clustering problems from the optimization viewpoint. Properties of the fitness landscape are analysed, and it is argued that these problems are highly suitable for algorithm benchmarking. A set of 27 problem instances (from 4-D to 40-D), based on three well-known datasets, is specified. Baseline experimental results are presented for the Covariance Matrix Adaptation Evolution Strategy and several other standard algorithms. A web repository has also been created for this problem set to facilitate its future use in algorithm evaluation and comparison.
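To make the optimization viewpoint concrete, the sum-of-squares clustering objective can be sketched as a continuous black-box function of the concatenated cluster-centre coordinates (so k centres in d dimensions give a k·d-dimensional search space). This is a generic sketch of the standard formulation, not the paper's benchmark code; the toy dataset below is synthetic and is not one of the three datasets used in the paper.

```python
import numpy as np

def ssc_objective(x, data, k):
    """Sum-of-squares clustering objective.

    x    : flat vector of k cluster centres, length k * d
    data : (n, d) array of data points
    Each point contributes its squared Euclidean distance to the
    nearest centre; the objective is the sum over all points.
    """
    d = data.shape[1]
    centres = x.reshape(k, d)
    # Squared distances from every point to every centre, shape (n, k)
    dists = ((data[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    # Assign each point to its nearest centre and sum
    return dists.min(axis=1).sum()

# Toy example with two well-separated synthetic clusters in 2-D
rng = np.random.default_rng(0)
data = np.vstack([rng.normal(0.0, 0.1, (20, 2)),
                  rng.normal(3.0, 0.1, (20, 2))])

# A 4-D candidate solution (k = 2 centres in 2-D), one centre per cluster
x_good = np.array([0.0, 0.0, 3.0, 3.0])
print(ssc_objective(x_good, data, k=2))
```

Because the function only requires evaluating candidate centre vectors, any continuous black-box optimizer can be run on it directly, which is what makes such problems convenient as benchmarks.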


Keywords: Algorithm benchmarking · Continuous black-box optimization · Clustering · Fitness landscape analysis


Compliance with ethical standards

Conflict of interest

The author declares that he has no conflict of interest.



Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  1. School of Information Technology and Electrical Engineering, The University of Queensland, Brisbane, Australia
