Abstract
In this chapter, we carry out an empirical study of four representative evolutionary algorithm models from two instance-selection perspectives: prototype selection and training set selection for data reduction in knowledge discovery. The study also compares these algorithms against nonevolutionary instance-selection algorithms. The results show that the evolutionary instance-selection algorithms consistently outperform the nonevolutionary ones, offering two advantages simultaneously: better instance-reduction rates and higher classification accuracy.
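The general idea behind the evolutionary approach can be sketched as follows: each candidate solution is a binary mask over the training set (bit i set means instance i is kept), and fitness rewards both 1-NN classification accuracy and the reduction rate. The sketch below is a minimal, generic genetic algorithm for illustration only; it is not any of the four specific models compared in the chapter, and the dataset, parameter values, and `alpha` weighting are assumptions.

```python
# Minimal sketch of GA-based instance selection (illustrative only,
# not the chapter's exact algorithms). Chromosome = bit mask over the
# training set; fitness = alpha * 1-NN accuracy + (1 - alpha) * reduction.
import random

def one_nn_accuracy(selected, train, labels):
    """Classify each training point with its nearest *selected*
    instance (excluding itself) and return the fraction correct."""
    correct = 0
    for i, x in enumerate(train):
        best, best_d = None, float("inf")
        for j in selected:
            if j == i:
                continue
            d = sum((a - b) ** 2 for a, b in zip(x, train[j]))
            if d < best_d:
                best, best_d = j, d
        if best is not None and labels[best] == labels[i]:
            correct += 1
    return correct / len(train)

def fitness(mask, train, labels, alpha=0.5):
    """Trade off accuracy against reduction; alpha=0.5 is an assumption."""
    selected = [i for i, bit in enumerate(mask) if bit]
    if not selected:
        return 0.0
    reduction = 1.0 - len(selected) / len(train)
    return alpha * one_nn_accuracy(selected, train, labels) + (1 - alpha) * reduction

def evolve(train, labels, pop_size=20, generations=30, seed=0):
    """Plain generational GA with elitism, one-point crossover,
    and bit-flip mutation."""
    rng = random.Random(seed)
    n = len(train)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=lambda m: fitness(m, train, labels), reverse=True)
        elite = scored[: pop_size // 2]          # keep the better half
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            cut = rng.randrange(1, n)            # one-point crossover
            child = p1[:cut] + p2[cut:]
            if rng.random() < 0.1:               # bit-flip mutation
                k = rng.randrange(n)
                child[k] ^= 1
            children.append(child)
        pop = elite + children
    return max(pop, key=lambda m: fitness(m, train, labels))
```

On a toy two-cluster dataset, the evolved mask typically keeps only a few instances per class while preserving 1-NN accuracy, which is precisely the accuracy/reduction trade-off the chapter evaluates empirically.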
© 2005 Springer-Verlag London Limited
Cite this chapter
Cano, J.R., Herrera, F., Lozano, M. (2005). Instance Selection Using Evolutionary Algorithms: An Experimental Study. In: Pal, N.R., Jain, L. (eds) Advanced Techniques in Knowledge Discovery and Data Mining. Advanced Information and Knowledge Processing. Springer, London. https://doi.org/10.1007/1-84628-183-0_5
DOI: https://doi.org/10.1007/1-84628-183-0_5
Publisher Name: Springer, London
Print ISBN: 978-1-85233-867-1
Online ISBN: 978-1-84628-183-9
eBook Packages: Computer Science (R0)