Genetic Clustering for Data Mining

  • Murilo Coelho Naldi
  • André C. P. L. F. de Carvalho
  • Ricardo José Gabrielli Barreto Campello
  • Eduardo Raul Hruschka

Genetic Algorithms (GAs) have been successfully applied to several complex data analysis problems in a wide range of domains, such as image processing, bioinformatics, and crude oil analysis. The need for organizing data into categories of similar objects has made the task of clustering increasingly important to those domains. In this chapter, the authors present a survey of the use of GAs for clustering applications. A variety of encoding (chromosome representation) approaches, fitness functions, and genetic operators are described, all of them customized to solve problems in such an application context.


Genetic Algorithm; Encode Scheme; Mutation Operator; Crossover Operator; Genetic Operator
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer Science+Business Media, LLC 2008

Authors and Affiliations

  • Murilo Coelho Naldi (1)
  • André C. P. L. F. de Carvalho (1)
  • Ricardo José Gabrielli Barreto Campello (1)
  • Eduardo Raul Hruschka (1)

  1. Universidade de São Paulo, Brazil
