An Investigation of Representations and Operators for Evolutionary Data Clustering with a Variable Number of Clusters

  • Julia Handl
  • Joshua Knowles
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4193)


This paper analyses the properties of four alternative representation/operator combinations suitable for data clustering algorithms that keep the number of clusters variable. The representations are compared by their performance within a multiobjective evolutionary clustering algorithm (MOCK), which we have described previously. To shed light on the observed performance differences, we consider the relative size of the search space and the heuristic bias inherent to each representation, as well as its locality and heritability under the associated variation operators. We find that the representation that performs worst under random initialization is nevertheless the best overall performer given the heuristic initialization normally used in MOCK. This suggests that there are strong interaction effects between initialization, representation and operators in this problem.
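As a rough illustration of what "relative size of the search space" can mean for clustering encodings, the sketch below compares the number of genotypes of a simple label-vector encoding with the number of distinct partitions it can express. The label-vector encoding, the function names and the parameter `k_max` are assumptions made for this example only; the abstract does not say which four representations the paper studies, and this is not taken from the paper.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling number of the second kind: ways to split n labelled
    items into exactly k non-empty, unlabelled clusters."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    # Item n either joins one of the k existing clusters
    # or forms a new cluster on its own.
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


def num_partitions(n, k_max):
    """Distinct clusterings of n items using between 1 and k_max clusters."""
    return sum(stirling2(n, k) for k in range(1, k_max + 1))


def num_label_vectors(n, k_max):
    """Genotypes of a hypothetical encoding that assigns each of the
    n items one of k_max cluster labels (assumed for illustration)."""
    return k_max ** n


if __name__ == "__main__":
    for n in (4, 10, 20):
        genotypes = num_label_vectors(n, n)
        phenotypes = num_partitions(n, n)
        print(n, genotypes, phenotypes, genotypes / phenotypes)
```

The ratio printed in the last column grows quickly with `n`, which is one simple way to see that an encoding can be highly redundant: many genotypes map to the same clustering because cluster labels are interchangeable.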


Keywords: Pareto Front; Data Item; Cluster Solution; Cluster Membership; Initialization Scheme
These keywords were added by machine and not by the authors.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Julia Handl (1)
  • Joshua Knowles (1)
  1. Manchester Interdisciplinary Biocentre, University of Manchester, UK
