Nearest Neighbor
Synonyms
Definition
In a data collection M, the nearest neighbor to a data object q is the data object Mi, which minimizes dist (q, Mi), where dist is a distance measure defined for the objects in question. Note that the fact that the object Mi is the nearest neighbor to q does not imply that q is the nearest neighbor to Mi.
Motivation and Background
Nearest neighbors are useful in many machine learning and data mining tasks, such as classification, anomaly detection, and motif discovery and in more general tasks such as spell checking, vector quantization, plagiarism detection, web search, and recommender systems.
The naive method to find the nearest neighbor to a point q requires a linear scan of all objects in M. Since this may be unacceptably slow for large datasets and/or computationally demanding distance measures, there is a huge amount of literature on speeding up nearest neighbor searches (query-by-content). The fastest methods depend on...
Recommended Reading
- Guttman, A. (1984). R-trees: A dynamic index structure for spatial searching. In Proceedings of the 1984 ACM SIGMOD international conference on management of data (pp. 47–57). New York: ACM. ISBN 0-89791-128-8Google Scholar
- Manolopoulos, Y., Nanopoulos, A., Papadopoulos, A. N., & Theodoridis, Y. (2005). R-trees: Theory and applications. Berlin: Springer.Google Scholar
- Zezula, P., Amato, G., Dohnal, V., & Batko, M. (2005). Similarity search: The metric space approach. In Advances in database systems (Vol. 32, p. 220). New York: Springer. ISBN 0-387-29146-6Google Scholar