Theoretical analysis of case retrieval method based on neighborhood of a new problem
The retrieval of similar cases is often performed by using the neighborhood of a new problem. The neighborhood is usually denned by a certain fixed number of most similar cases (k nearest neighbors) to the problem. This paper deals with an alternative definition of neighborhood that comprises the cases within a certain distance, d, from the problem. We present an average-case analysis of a classifier, the d-nearest neighborhood method (d-NNh), that retrieves cases in this neighborhood and predicts their majority class as the class of the problem. Our analysis deals with m-of-n/l target concepts, and handles three types of noise. We formally compute the expected classification accuracy of d-NNh, then we explore the predicted behavior of d-NNh. By combining this exploration for d-NNh and one for k-nearest neighbor method (k-NN) in our previous study, we compare the predicted behavior of each in noisy domains. Our formal analysis is supported with Monte Carlo simulations.
Unable to display preview. Download preview PDF.
- 1.Aha, D., Kibler, D., and Albert, M. Instance-Based Learning Algorithms. Machine Learning, 6, (1991) 37–66.Google Scholar
- 2.Albert, M. and Aha, D. Analyses of Instance-Based Learning Algorithms. In Proceedings of AAAI-91, (1991) 553–558. AAAI Press/MIT Press.Google Scholar
- 3.Cover. T. and Hart, P. Nearest Neighbor Pattern Classification. IEEE Transactions on Information Theory. 13(1). (1967) 21–27.Google Scholar
- 4.Creecy, H., Masand, M., Smith, J., and Waltz, D. Trading Mips and Memory for Knowledge Engineering. Communications of the A CM, 35(8), (1992) 48–63.Google Scholar
- 5.Drakopoulos, J. Bounds on the Classification Error of the Nearest Neighbor Rule. In Proceedings of ICML-95, (1995) 203–208. Morgan Kaufmann.Google Scholar
- 6.Langley. P. and Iba, W. Average-Case Analysis of a Nearest Neighbor Algorithm. In Proceedings of IJCAI-93, (1993) 889–894. Morgan Kaufmann.Google Scholar
- 7.Murphy, P. and Pazzani, M. ID2-of-3: Constructive Induction of M-of-N Concepts for Discriminators in Decision Trees. In Proceedings of IWML-91, (1991) 183–187. Morgan Kaufmann.Google Scholar
- 8.O'Callaghan, J. P. An Alternative Definition for Neighborhood of a Point, IEEE Transactions on Computers, 24(11), (1975) 1121–1125.Google Scholar
- 9.Okamoto, S. and Satoh, K. An Average-Case Anaysis of k-Nearest Neighbor Cassifier. In Proceedings of ICCBR-95 (Veoso, M. and Aamodt, A. Eds., LNAI, 1010), (1995) 243–264. Springer-Verag.Google Scholar
- 10.Okamoto, S. and Yugami, N. Theoretica Anaysis of the Nearest Neighbor Cassifier in Noisy Domains. In Proceedings of ICML-96 (1996) 355–363. Morgan Kaufmann.Google Scholar
- 11.Okamoto, S. and Yugami, N. An Average-Case Anaysis of the k-Nearest Neighbor Cassifier for Noisy Domains. In Proceedings of IJCAI-97, (1997) to appear. Morgan Kaufmann.Google Scholar
- 12.Pazzani, M. and Sarrett, W. A Framework for Average Case Anaysis of Conjunctive Learning Agorithms. Machine Learning, 9 (1992) 349–372.Google Scholar
- 13.Wettschereck, D. and Aha, D. Weighting Features. In Proceedings of ICCBR-95 (Veoso, M. and Aamodt, A. Eds., LNAI, 1010), (1995) 347–358. Springer-Verag.Google Scholar