Machine Learning, Volume 6, Issue 1, pp 37–66

Instance-Based Learning Algorithms

  • David W. Aha
  • Dennis Kibler
  • Marc K. Albert

Abstract

Storing and using specific instances improves the performance of several supervised learning algorithms. These include algorithms that learn decision trees, classification rules, and distributed networks. However, no investigation has analyzed algorithms that use only specific instances to solve incremental learning tasks. In this paper, we describe a framework and methodology, called instance-based learning, that generates classification predictions using only specific instances. Instance-based learning algorithms do not maintain a set of abstractions derived from specific instances. This approach extends the nearest neighbor algorithm, which has large storage requirements. We describe how storage requirements can be significantly reduced with, at most, minor sacrifices in learning rate and classification accuracy. While the storage-reducing algorithm performs well on several real-world databases, its performance degrades rapidly with the level of attribute noise in training instances. Therefore, we extended it with a significance test to distinguish noisy instances. This extended algorithm's performance degrades gracefully with increasing noise levels and compares favorably with a noise-tolerant decision tree algorithm.
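The abstract describes the storage-reducing mechanism only at a high level. As a rough, hypothetical sketch (not the paper's actual IB algorithms), the idea amounts to a 1-nearest-neighbor classifier that stores a training instance only when the instances already in memory misclassify it. The class name, the Euclidean distance over numeric attributes, and the toy data below are illustrative assumptions.

```python
# Illustrative sketch only: a minimal instance-storing classifier in the spirit
# of the storage-reducing idea described in the abstract (keep an instance only
# when the current memory misclassifies it). Not the authors' code.
import math

class SimpleInstanceLearner:
    def __init__(self):
        self.memory = []  # list of (feature_vector, label) pairs

    def _distance(self, x, y):
        # Euclidean distance over numeric attributes (an assumption; the paper
        # also addresses symbolic attributes and missing values).
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

    def predict(self, x):
        # 1-nearest-neighbor prediction from the stored instances.
        if not self.memory:
            return None
        nearest = min(self.memory, key=lambda inst: self._distance(x, inst[0]))
        return nearest[1]

    def train_incremental(self, x, label):
        # Store an instance only if the current memory gets it wrong,
        # which keeps storage well below that of plain nearest neighbor.
        if self.predict(x) != label:
            self.memory.append((x, label))

# Usage: feed training instances one at a time, then classify a new one.
learner = SimpleInstanceLearner()
for features, label in [([0.1, 0.2], "a"), ([0.9, 0.8], "b"), ([0.2, 0.1], "a")]:
    learner.train_incremental(features, label)
print(learner.predict([0.15, 0.15]))  # expected: "a"
```

A noise-tolerant variant along the lines summarized in the abstract would additionally track each stored instance's classification record and discard instances whose accuracy fails a significance test; that bookkeeping is omitted from this sketch.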

Keywords: supervised concept learning, instance-based concept descriptions, incremental learning, learning theory, noise, similarity

Copyright information

© Kluwer Academic Publishers 1991

Authors and Affiliations

  • David W. Aha
  • Dennis Kibler
  • Marc K. Albert

  Department of Information and Computer Science, University of California, Irvine, CA (all authors)
