An average-case analysis of k-nearest neighbor classifier

Okamoto, Seishi; Satoh, Ken

doi:10.1007/3-540-60598-3_23

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 1010))

Included in the following conference series:

International Conference on Case-Based Reasoning

Abstract

In this paper, we perform an average-case analysis of the k-nearest neighbor classifier (k-NNC) for a subclass of Boolean threshold functions. Our analysis is based on a formal computation of the classifier's predictive accuracy under the assumption of noise-free Boolean features and a uniform instance distribution. The predictive accuracy is expressed as a function of the number of features, the threshold, the number of training instances, and the number of nearest neighbors. We also present the predictive behavior of the classifier by systematically varying the parameters of the accuracy function. Plotting the accuracy against k, we observe that the performance of the classifier improves as k increases, reaches a maximum, and then deteriorates. We further investigate the relationship between the number of training instances and the optimal value of k, and observe that the optimal k increases gradually as the number of training instances increases.
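
The setting described in the abstract can be explored empirically with a small Monte Carlo simulation. The sketch below is not the paper's closed-form accuracy function. It assumes, for illustration only, an m-of-n target concept (one kind of Boolean threshold function), Hamming distance, and an unweighted majority vote. All function and parameter names (`m_of_n_label`, `knn_predict`, `estimate_accuracy`, `n_relevant`, and so on) are hypothetical.

```python
import random


def m_of_n_label(x, n_relevant, threshold):
    """Assumed target concept: 1 when at least `threshold` of the
    first `n_relevant` Boolean features are 1, else 0."""
    return 1 if sum(x[:n_relevant]) >= threshold else 0


def hamming(a, b):
    """Number of positions where two Boolean tuples differ."""
    return sum(u != v for u, v in zip(a, b))


def knn_predict(train, query, k):
    """Majority vote over the k training instances nearest to `query`.
    Distance ties keep training order (stable sort); vote ties give 0."""
    ranked = sorted(train, key=lambda inst: hamming(inst[0], query))
    votes = sum(label for _, label in ranked[:k])
    return 1 if 2 * votes > k else 0


def estimate_accuracy(n_features, n_relevant, threshold, n_train, k,
                      trials=200, seed=0):
    """Estimate predictive accuracy by sampling noise-free training sets
    and one test instance per trial from the uniform distribution."""
    rng = random.Random(seed)

    def draw():
        return tuple(rng.randint(0, 1) for _ in range(n_features))

    correct = 0
    for _ in range(trials):
        train = []
        for _ in range(n_train):
            x = draw()
            train.append((x, m_of_n_label(x, n_relevant, threshold)))
        q = draw()
        truth = m_of_n_label(q, n_relevant, threshold)
        correct += int(knn_predict(train, q, k) == truth)
    return correct / trials


if __name__ == "__main__":
    # Vary k (odd values avoid vote ties) with the other parameters fixed.
    for k in (1, 3, 5, 7, 9):
        acc = estimate_accuracy(n_features=8, n_relevant=5, threshold=3,
                                n_train=30, k=k)
        print(k, acc)
```

Running the loop for several training-set sizes gives a rough empirical view of how accuracy changes with k. The estimates are noisy and depend on the assumed concept, so they should not be read as reproducing the paper's figures.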

Author information

Authors and Affiliations

Fujitsu Laboratories Limited, 1015 Kamikodanaka, Nakahara-ku, 211, Kawasaki, Japan
Seishi Okamoto & Ken Satoh

Editor information

Manuela Veloso, Agnar Aamodt

Cite this paper

Okamoto, S., Satoh, K. (1995). An average-case analysis of k-nearest neighbor classifier. In: Veloso, M., Aamodt, A. (eds) Case-Based Reasoning Research and Development. ICCBR 1995. Lecture Notes in Computer Science, vol 1010. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-60598-3_23

DOI: https://doi.org/10.1007/3-540-60598-3_23
Published: 05 August 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-60598-0
Online ISBN: 978-3-540-48446-2
eBook Packages: Springer Book Archive
