An Empirical Comparison of Exact Nearest Neighbour Algorithms
Nearest neighbour search (NNS) is a long-standing problem of practical importance in a number of fields. Given a set of points and a query point q, it involves finding the one or more points in the set that are nearest to q. Since the problem was first posed, a great number of algorithms and techniques have been proposed for its solution. However, many of these algorithms have not been compared against each other on a wide variety of datasets. This research attempts to fill this gap to some extent by presenting a detailed empirical comparison of three prominent data structures for exact NNS: KD-Trees, Metric Trees, and Cover Trees. Our results suggest that there is generally little gain in using Metric Trees or Cover Trees instead of KD-Trees for the standard NNS problem.
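To make the problem statement concrete, the following is a minimal sketch of exact NNS by linear scan. It is not one of the three data structures compared in this work; it is only the brute-force baseline that any exact method must agree with. The Euclidean distance and the names `euclidean` and `k_nearest` are assumptions chosen for illustration.

```python
import heapq
import math
from typing import List, Sequence, Tuple

Point = Sequence[float]


def euclidean(a: Point, b: Point) -> float:
    """Euclidean distance between two points of equal dimension (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_nearest(points: List[Point], q: Point, k: int = 1) -> List[Tuple[float, int]]:
    """Return the k (distance, index) pairs closest to the query q, nearest first.

    Every point is examined, so the cost per query grows linearly with the
    size of the point set.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    scored = ((euclidean(p, q), i) for i, p in enumerate(points))
    return heapq.nsmallest(k, scored)


if __name__ == "__main__":
    data = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0), (10.0, 10.0)]
    query = (0.9, 0.9)
    for dist, idx in k_nearest(data, query, k=2):
        print(idx, data[idx], round(dist, 4))
```

Tree-based structures such as those compared here aim to return the same answer as this scan while examining fewer points per query.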