Data Mining and Knowledge Discovery, Volume 6, Issue 2, pp 153–172

Advances in Instance Selection for Instance-Based Learning Algorithms

  • Henry Brighton
  • Chris Mellish

Abstract

The basic nearest neighbour classifier suffers from the indiscriminate storage of all presented training instances. With a large database of instances, classification response time can be slow. When noisy instances are present, classification accuracy can suffer. Drawing on the large body of relevant work carried out in the past 30 years, we review the principal approaches to solving these problems. By deleting instances, both problems can be alleviated, but the criterion used is typically assumed to be all-encompassing and effective over many domains. We argue against this position and introduce an algorithm that rivals the most successful existing algorithm. When evaluated on 30 different problems, neither algorithm consistently outperforms the other: consistency is very hard to achieve. To achieve the best results, we need to develop mechanisms that provide insights into the structure of class definitions. We discuss the possibility of these mechanisms and propose some initial measures that could be useful for the data miner.

instance-based learning, instance selection, forgetting, pruning
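
As an illustration of the two problems named in the abstract (storing every instance, and sensitivity to noisy instances), the sketch below shows a basic nearest neighbour classifier together with one simple deletion criterion: an instance is dropped when its label disagrees with the majority vote of its k nearest other instances. This is a generic noise filter in the style of edited nearest neighbour rules, not the algorithm introduced in the paper. The function names `knn_predict` and `edit_noisy` and the toy data are assumptions made for this example.

```python
from collections import Counter
from math import dist


def knn_predict(instances, query, k=1):
    """Classify `query` by majority vote among its k nearest stored instances."""
    neighbours = sorted(instances, key=lambda inst: dist(inst[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def edit_noisy(instances, k=3):
    """Keep only instances whose label matches the vote of their k nearest other instances."""
    kept = []
    for i, (point, label) in enumerate(instances):
        others = instances[:i] + instances[i + 1:]
        if knn_predict(others, point, k) == label:
            kept.append((point, label))
    return kept


# Hypothetical toy data: two clusters, with one instance in the "a" cluster labelled "b".
training = [
    ((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((0.2, 0.1), "a"),
    ((0.15, 0.15), "b"),  # noisy instance
    ((1.0, 1.0), "b"), ((0.9, 1.1), "b"), ((1.1, 0.9), "b"),
]

reduced = edit_noisy(training, k=3)
query = (0.14, 0.16)
print(len(training), knn_predict(training, query))  # full store, 1-NN prediction
print(len(reduced), knn_predict(reduced, query))    # edited store, 1-NN prediction
```

On this toy data the filter removes only the mislabelled point, which changes the 1-NN prediction for a query that lies next to it. A criterion of this kind targets noise rather than storage; whether any single criterion works across domains is the question the abstract raises.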

Copyright information

© Kluwer Academic Publishers 2002

Authors and Affiliations

  • Henry Brighton (1)
  • Chris Mellish (2)
  1. Language Evolution and Computation Research Unit, Department of Theoretical and Applied Linguistics, The University of Edinburgh, Edinburgh, UK
  2. Department of Artificial Intelligence, The University of Edinburgh, Edinburgh, UK
