Abstract
In this chapter, we carry out an empirical study of four representative evolutionary algorithm models from two instance-selection perspectives: prototype selection and training set selection for data reduction in knowledge discovery. The study also compares these algorithms against nonevolutionary instance-selection algorithms. The results show that the evolutionary instance-selection algorithms consistently outperform the nonevolutionary ones, offering two advantages simultaneously: better instance-reduction rates and higher classification accuracy.
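The general idea behind the evolutionary approach can be sketched as follows: each candidate solution is a binary mask over the training set (bit i set means instance i is kept), and fitness rewards both 1-NN classification accuracy and the reduction rate. The sketch below is a minimal, generic genetic algorithm for illustration only; it is not any of the four specific models compared in the chapter, and the dataset, parameter values, and `alpha` weighting are assumptions.

```python
# Minimal sketch of GA-based instance selection (illustrative only,
# not the chapter's exact algorithms). Chromosome = bit mask over the
# training set; fitness = alpha * 1-NN accuracy + (1 - alpha) * reduction.
import random

def one_nn_accuracy(selected, train, labels):
    """Classify each training point with its nearest *selected*
    instance (excluding itself) and return the fraction correct."""
    correct = 0
    for i, x in enumerate(train):
        best, best_d = None, float("inf")
        for j in selected:
            if j == i:
                continue
            d = sum((a - b) ** 2 for a, b in zip(x, train[j]))
            if d < best_d:
                best, best_d = j, d
        if best is not None and labels[best] == labels[i]:
            correct += 1
    return correct / len(train)

def fitness(mask, train, labels, alpha=0.5):
    """Trade off accuracy against reduction; alpha=0.5 is an assumption."""
    selected = [i for i, bit in enumerate(mask) if bit]
    if not selected:
        return 0.0
    reduction = 1.0 - len(selected) / len(train)
    return alpha * one_nn_accuracy(selected, train, labels) + (1 - alpha) * reduction

def evolve(train, labels, pop_size=20, generations=30, seed=0):
    """Plain generational GA with elitism, one-point crossover,
    and bit-flip mutation."""
    rng = random.Random(seed)
    n = len(train)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=lambda m: fitness(m, train, labels), reverse=True)
        elite = scored[: pop_size // 2]          # keep the better half
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            cut = rng.randrange(1, n)            # one-point crossover
            child = p1[:cut] + p2[cut:]
            if rng.random() < 0.1:               # bit-flip mutation
                k = rng.randrange(n)
                child[k] ^= 1
            children.append(child)
        pop = elite + children
    return max(pop, key=lambda m: fitness(m, train, labels))
```

On a toy two-cluster dataset, the evolved mask typically keeps only a few instances per class while preserving 1-NN accuracy, which is precisely the accuracy/reduction trade-off the chapter evaluates empirically.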
© 2005 Springer-Verlag London Limited
Cite this chapter
Cano, J.R., Herrera, F., Lozano, M. (2005). Instance Selection Using Evolutionary Algorithms: An Experimental Study. In: Pal, N.R., Jain, L. (eds) Advanced Techniques in Knowledge Discovery and Data Mining. Advanced Information and Knowledge Processing. Springer, London. https://doi.org/10.1007/1-84628-183-0_5
DOI: https://doi.org/10.1007/1-84628-183-0_5
Publisher Name: Springer, London
Print ISBN: 978-1-85233-867-1
Online ISBN: 978-1-84628-183-9
eBook Packages: Computer Science (R0)