A Statistical Confidence-Based Adaptive Nearest Neighbor Algorithm for Pattern Classification
The k-nearest neighbor rule is one of the simplest and most attractive pattern classification algorithms. It can be interpreted as an empirical Bayes classifier based on the a posteriori probabilities estimated from the k nearest neighbors. The performance of the k-nearest neighbor rule relies on the assumption that the a posteriori probability is locally constant. This assumption, however, becomes problematic in high-dimensional spaces due to the curse of dimensionality. In this paper we introduce a locally adaptive nearest neighbor rule. Instead of using the Euclidean distance to locate the nearest neighbors, the proposed method takes into account the effective influence size of each training example and the statistical confidence with which the label of each training example can be trusted. We test the new method on real-world benchmark datasets and compare it with the standard k-nearest neighbor rule and support vector machines. The experimental results confirm the effectiveness of the proposed method.
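The empirical Bayes reading of the standard k-nearest neighbor rule can be illustrated with a minimal sketch. It covers only the baseline rule with Euclidean distance, not the locally adaptive method proposed in the paper. The function names `knn_posteriors` and `knn_classify` and the toy data are hypothetical, chosen for illustration.

```python
import math
from collections import Counter


def knn_posteriors(train, query, k):
    """Estimate P(class | query) as class fractions among the k nearest points.

    `train` is a list of (feature_vector, label) pairs (an assumed layout);
    distances are Euclidean, as in the standard rule.
    """
    # Sort training examples by Euclidean distance to the query point.
    by_distance = sorted(train, key=lambda ex: math.dist(ex[0], query))
    neighbors = by_distance[:k]
    counts = Counter(label for _, label in neighbors)
    # The fraction k_c / k is the empirical posterior estimate for class c.
    return {label: n / len(neighbors) for label, n in counts.items()}


def knn_classify(train, query, k):
    """Pick the class with the largest estimated posterior (majority rule)."""
    posteriors = knn_posteriors(train, query, k)
    return max(posteriors, key=posteriors.get)


# Hypothetical toy data: two small clusters in the plane.
train = [
    ((0.0, 0.0), "a"),
    ((0.1, 0.2), "a"),
    ((0.2, 0.1), "a"),
    ((1.0, 1.0), "b"),
    ((0.9, 1.1), "b"),
]
query = (0.3, 0.3)
posteriors = knn_posteriors(train, query, k=3)
label = knn_classify(train, query, k=3)
```

The estimate treats the posterior as constant over the neighborhood that contains the k neighbors. That neighborhood must grow as the dimension increases, which is why the locally constant assumption becomes problematic in high-dimensional spaces.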
Keywords: Support Vector Machine, Class Label, Majority Rule, Near Neighbor, Query Point