Adaptive k-Nearest-Neighbor Classification Using a Dynamic Number of Nearest Neighbors
Classification based on k-nearest neighbors (kNN classification) is one of the most widely used classification methods. The number k of nearest neighbors needed to achieve high classification accuracy is fixed in advance and depends strongly on the data set. If the data set is large, sequential or binary search for nearest neighbors (NNs) becomes impractical because of the computational cost, so indexing schemes are frequently used to speed up classification. If the required number of nearest neighbors is high, however, an index alone may not be adequate to achieve high performance. In this paper, we demonstrate that the execution of the nearest neighbor search algorithm can be interrupted when certain criteria are satisfied. In this way, a decision can be made without computing all k nearest neighbors of a new object. We study three heuristics that give the nearest neighbor algorithm an early-break capability. These heuristics aim at (i) reducing computation and I/O costs as much as possible, and (ii) keeping classification accuracy high. Experimental results on real-life data sets illustrate that the proposed method can achieve better performance than existing methods.
Keywords: kNN classification, multidimensional data, performance
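The early-break idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the three heuristics, so the stopping rule here is an assumption chosen for illustration: stop as soon as the leading class is ahead by more votes than there are neighbors still to be examined, since the majority vote can no longer change. The function name `knn_early_break` and the heap-based neighbor stream are hypothetical. The heap stands in for an incremental nearest neighbor search over an index and does not reproduce the paper's I/O behavior.

```python
import heapq
from collections import Counter
from math import dist


def knn_early_break(train, query, k):
    """Majority-vote kNN that stops once the winner is already decided.

    train: list of (point, label) pairs; query: a point; k: neighbor budget.
    Returns (predicted_label, neighbors_examined).
    Assumption: this stopping rule is illustrative, not one of the paper's
    three heuristics.
    """
    if not train:
        raise ValueError("training set must not be empty")
    k = min(k, len(train))

    # A heap yields neighbors one at a time in order of increasing distance.
    # The index i breaks distance ties so labels are never compared.
    heap = [(dist(point, query), i, label)
            for i, (point, label) in enumerate(train)]
    heapq.heapify(heap)

    votes = Counter()
    for seen in range(1, k + 1):
        _, _, label = heapq.heappop(heap)
        votes[label] += 1

        top = votes.most_common(2)
        lead = top[0][1]
        runner_up = top[1][1] if len(top) > 1 else 0
        remaining = k - seen

        # Even if every remaining neighbor voted for the runner-up,
        # it could not catch the leader: stop early.
        if lead - runner_up > remaining:
            return top[0][0], seen

    return votes.most_common(1)[0][0], k
```

Because the rule only fires when the outcome is mathematically settled, this particular sketch returns the same label as a full k-neighbor majority vote. Heuristics that stop earlier than this would trade some accuracy for lower cost, which is the trade-off the abstract describes.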