Application of a Genetic Algorithm to Nearest Neighbour Classification

  • Semen Simkin
  • Tim Verwaart
  • Hans Vrolijk
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3533)

Abstract

This paper describes the application of a genetic algorithm to nearest-neighbour-based imputation of sample data into a census dataset. The genetic algorithm optimises both the selection and the weights of the variables used to measure distance. The results show that the measure of fit can be improved by selecting imputation variables with a genetic algorithm: the percentage of variance explained in the goal variables increases compared to a simple selection of imputation variables. This quantitative approach to the selection of imputation variables does not deny the importance of expertise. Human expertise is still essential in defining the optional set of imputation variables.
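The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the operators, the fitness function, or the data. The sketch assumes 1-nearest-neighbour donor imputation, a leave-one-out estimate of variance explained as the measure of fit, truncation selection with elitism, uniform crossover, and synthetic data. All function names (`weighted_distance`, `loo_variance_explained`, `evolve_weights`) are hypothetical.

```python
import random

def weighted_distance(a, b, weights):
    """Weighted squared Euclidean distance; a zero weight deselects a variable."""
    return sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b))

def loo_variance_explained(X, y, weights):
    """Leave-one-out share of variance in y explained by 1-NN imputation."""
    n = len(X)
    mean_y = sum(y) / n
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    ss_res = 0.0
    for i in range(n):
        # The donor is the nearest *other* record under the weighted distance.
        donor = min((j for j in range(n) if j != i),
                    key=lambda j: weighted_distance(X[i], X[j], weights))
        ss_res += (y[i] - y[donor]) ** 2
    return 1.0 - ss_res / ss_tot

def evolve_weights(X, y, pop_size=20, generations=30, mutation_rate=0.2, seed=0):
    """Assumed GA: each individual is a weight vector, one weight per variable."""
    rng = random.Random(seed)
    n_vars = len(X[0])

    def random_gene():
        # Half of the genes start at zero, i.e. the variable is not selected.
        return rng.random() if rng.random() < 0.5 else 0.0

    def fitness(individual):
        return loo_variance_explained(X, y, individual)

    # Seed the population with the "simple selection": all variables, equal weight.
    population = [[1.0] * n_vars]
    population += [[random_gene() for _ in range(n_vars)]
                   for _ in range(pop_size - 1)]

    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]  # truncation selection, keeps the best
        children = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(p1, p2)]  # uniform crossover
            for k in range(n_vars):
                if rng.random() < mutation_rate:
                    child[k] = random_gene()
            children.append(child)
        population = parents + children
    return max(population, key=fitness)

# Synthetic data (an assumption): the goal variable depends on the first
# candidate variable only; the other two are noise.
data_rng = random.Random(1)
X = [[data_rng.random() for _ in range(3)] for _ in range(40)]
y = [10.0 * row[0] for row in X]

best = evolve_weights(X, y)
baseline = loo_variance_explained(X, y, [1.0, 1.0, 1.0])
tuned = loo_variance_explained(X, y, best)
print("baseline fit:", round(baseline, 3), "tuned fit:", round(tuned, 3))
```

Because the equal-weight vector is part of the initial population and the best individuals survive each generation, the tuned fit in this sketch can never fall below the baseline. The candidate variables in `X` are still chosen by hand, which corresponds to the role the abstract assigns to human expertise.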

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Semen Simkin (1)
  • Tim Verwaart (1)
  • Hans Vrolijk (1)
  1. LEI Wageningen UR, Den Haag, Netherlands
