A Genetic Programming Approach to Data Clustering

  • Chang Wook Ahn
  • Sanghoun Oh
  • Moonyoung Oh
Part of the Communications in Computer and Information Science book series (CCIS, volume 263)


This paper presents a genetic programming (GP) approach to data clustering. The aim is to accurately classify a set of input data into their genuine clusters. The idea lies in discovering a mathematical function that captures clustering regularities and then utilizing that rule to decide correctly which entities belong to each cluster. To this end, GP is incorporated into the clustering procedure. Each individual is represented by a parse tree over the program set. The fitness function evaluates the quality of clustering with regard to similarity criteria. Crossover exchanges sub-trees between parental candidates in a positionally independent fashion. Mutation introduces (in part) a new sub-tree with a low probability. The variation operators (i.e., crossover and mutation) offer an effective search capability that improves solution quality and speeds up convergence. Experimental results demonstrate that the proposed approach outperforms a well-known reference method.
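
The components named in the abstract (parse-tree individuals, a fitness function, sub-tree crossover, and low-probability sub-tree mutation) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function set `FUNCS`, the terminal set `TERMINALS`, the rule that the sign of the tree's output assigns a point to one of two clusters, and the use of negated within-cluster squared distance as the similarity criterion are all assumptions made here for illustration.

```python
import random

# Hypothetical function set: name -> (arity, implementation).
FUNCS = {
    "add": (2, lambda a, b: a + b),
    "sub": (2, lambda a, b: a - b),
    "mul": (2, lambda a, b: a * b),
}
TERMINALS = ["x0", "x1", 1.0]  # two input features and a constant (assumed)


def random_tree(rng, depth):
    """Grow a random parse tree as nested tuples: (func_name, child, child)."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    name = rng.choice(sorted(FUNCS))
    return (name,) + tuple(random_tree(rng, depth - 1) for _ in range(FUNCS[name][0]))


def evaluate(tree, point):
    """Evaluate a tree on one data point (a sequence of two floats)."""
    if isinstance(tree, tuple):
        _, fn = FUNCS[tree[0]]
        return fn(*(evaluate(child, point) for child in tree[1:]))
    if tree == "x0":
        return point[0]
    if tree == "x1":
        return point[1]
    return tree  # numeric constant


def paths(tree, prefix=()):
    """List the path (tuple of child indices) to every node in the tree."""
    found = [prefix]
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            found.extend(paths(child, prefix + (i,)))
    return found


def get_subtree(tree, path):
    """Follow a path of child indices down to the node it names."""
    for i in path:
        tree = tree[i]
    return tree


def replace_subtree(tree, path, new):
    """Return a copy of tree with the node at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace_subtree(tree[i], path[1:], new),) + tree[i + 1:]


def crossover(rng, a, b):
    """Swap one randomly chosen sub-tree of a with one of b (any positions)."""
    pa, pb = rng.choice(paths(a)), rng.choice(paths(b))
    return (replace_subtree(a, pa, get_subtree(b, pb)),
            replace_subtree(b, pb, get_subtree(a, pa)))


def mutate(rng, tree, rate=0.1, depth=2):
    """With low probability, replace a random sub-tree with a fresh one."""
    if rng.random() >= rate:
        return tree
    return replace_subtree(tree, rng.choice(paths(tree)), random_tree(rng, depth))


def fitness(tree, data):
    """Assumed criterion: negated within-cluster sum of squared distances.

    The sign of the tree's output assigns each point to one of two clusters.
    """
    clusters = {True: [], False: []}
    for p in data:
        clusters[evaluate(tree, p) >= 0].append(p)
    if not clusters[True] or not clusters[False]:
        return float("-inf")  # degenerate: every point in a single cluster
    total = 0.0
    for members in clusters.values():
        cx = sum(p[0] for p in members) / len(members)
        cy = sum(p[1] for p in members) / len(members)
        total += sum((p[0] - cx) ** 2 + (p[1] - cy) ** 2 for p in members)
    return -total
```

In this sketch a full run would repeatedly select high-fitness trees, apply `crossover` and `mutate`, and keep the best tree as the learned clustering rule. Because the crossover points are drawn independently in each parent, a sub-tree can move to a different position in the offspring, which is one reading of "positionally independent".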





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Chang Wook Ahn (1)
  • Sanghoun Oh (1)
  • Moonyoung Oh (2)
  1. School of Information & Communication Engineering, Sungkyunkwan University, Suwon, Korea
  2. Department of Medical Administration, Busan College of Information Technology, Korea
