An Algorithm for Incremental Nearest Neighbor Search in High-Dimensional Data Spaces
The SPY-TEC (Spherical Pyramid-Technique) was proposed as a new indexing method for high-dimensional data spaces. It uses a special partitioning strategy that divides a d-dimensional data space into 2d spherical pyramids. Although its authors proposed an efficient algorithm for processing hyperspherical range queries, they did not propose one for the k-nearest neighbor queries that are frequently used in similarity search. In this paper, we propose an efficient algorithm for processing exact nearest neighbor queries on the SPY-TEC by extending a previously proposed incremental nearest neighbor algorithm. We also introduce a metric that can be used to guide an ordered best-first traversal when finding nearest neighbors on the SPY-TEC. Finally, extensive experiments comparing our technique with the R*-tree, the X-tree, and the sequential scan show that it significantly outperforms them in processing k-nearest neighbor queries.
Keywords: Similarity Search, High-Dimensional Index Technique, Nearest Neighbor Query, Incremental Nearest Neighbor Algorithm, Approximate Nearest Neighbor Algorithm, SPY-TEC
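The ordered best-first traversal mentioned in the abstract can be illustrated with a generic priority-queue sketch. This is not the SPY-TEC algorithm or its metric: the partitions here are plain lists of points, and the names `incremental_nearest` and `box_lower_bound` are hypothetical. The bounding-box distance is an assumed stand-in for whatever lower-bound metric the index provides.

```python
import heapq
import itertools
import math


def box_lower_bound(query, part):
    """Distance from query to the axis-aligned bounding box of a non-empty partition.

    Assumed stand-in metric: it never exceeds the distance to any point in part.
    """
    lo = [min(c) for c in zip(*part)]
    hi = [max(c) for c in zip(*part)]
    return math.sqrt(
        sum(max(l - q, 0.0, q - h) ** 2 for q, l, h in zip(query, lo, hi))
    )


def incremental_nearest(query, partitions, lower_bound):
    """Yield (distance, point) pairs in non-decreasing distance from query."""
    counter = itertools.count()  # tie-breaker so the heap never compares payloads
    heap = []
    for part in partitions:
        heapq.heappush(heap, (lower_bound(query, part), next(counter), False, part))
    while heap:
        dist, _, is_point, item = heapq.heappop(heap)
        if is_point:
            # Every partition still queued has a lower bound >= dist,
            # so no unseen point can be closer than this one.
            yield dist, item
        else:
            # Expand the partition: queue its points by exact distance.
            for p in item:
                heapq.heappush(heap, (math.dist(query, p), next(counter), True, p))


if __name__ == "__main__":
    parts = [[(0.0, 0.0), (1.0, 1.0)], [(5.0, 5.0), (6.0, 5.0)], [(2.0, 0.0)]]
    # Take the two nearest neighbors lazily; far partitions are never expanded.
    for d, p in itertools.islice(incremental_nearest((0.9, 0.9), parts, box_lower_bound), 2):
        print(p, round(d, 3))
```

Because the generator is lazy, a k-nearest neighbor query is just the first k results, and k does not have to be fixed in advance.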
- 1. A. Guttman. “R-trees: a dynamic index structure for spatial searching”. Proc. ACM SIGMOD Int. Conf. on Management of Data, pages 47–57, June 1984.
- 2. A. Henrich. “The LSDh-Tree: An Access Structure for Feature Vectors”. Proc. 14th Int. Conf. on Data Engineering, pages 362–369, 1998.
- 5. C.E. Jacobs, A. Finkelstein, and D.H. Salesin. “Fast Multiresolution Image Query”. Proc. of the 1995 ACM SIGGRAPH, New York, 1995.
- 6. D.A. White and R. Jain. “Similarity Indexing with the SS-tree”. Proc. 12th Int. Conf. on Data Engineering, pages 516–523, 1996.
- 9. C. Faloutsos. “Fast Searching by Content in Multimedia Databases”. Data Engineering Bulletin, 18(4), 1995.
- 12. J.R. Smith and S.-F. Chang. “VisualSEEk: a fully automated content-based image query system”. ACM Multimedia 96, Boston, MA, 1996.
- 13. J.T. Robinson. “The K-D-B-tree: a Search Structure for Large Multidimensional Dynamic Indexes”. Proc. ACM SIGMOD, Ann Arbor, USA, pages 10–18, April 1981.
- 15. L. Leithold. “Trigonometry”. Addison-Wesley, 1989.
- 16. D.H. Lee and H.J. Kim. “An Efficient Nearest Neighbor Search in High-Dimensional Data Spaces”. Seoul National University, CE Technical Report (OOPSLA-TR1028), http://oopsla.snu.ac.kr/~dhlee/OOPSLA-TR1028.ps, 2000.
- 17. N. Katayama and S. Satoh. “The SR-tree: An Index Structure for High-Dimensional Nearest Neighbor Queries”. Proc. ACM SIGMOD Int. Conf. on Management of Data, pages 517–542, May 1997.
- 18. N. Roussopoulos, S. Kelley, and F. Vincent. “Nearest Neighbor Queries”. Proc. ACM SIGMOD Int. Conf. on Management of Data, pages 71–79, 1995.
- 19. K. Beyer, J. Goldstein, R. Ramakrishnan, and U. Shaft. “When Is “Nearest Neighbor” Meaningful?”. Proc. 7th Int. Conf. on Database Theory, pages 217–235, January 1999.
- 20. S. Berchtold, C. Böhm, and H.-P. Kriegel. “The Pyramid-Technique: Towards Breaking the Curse of Dimensionality”. Proc. ACM SIGMOD Int. Conf. on Management of Data, 1998.
- 21. S. Berchtold, C. Böhm, D.A. Keim, and H.-P. Kriegel. “A Cost Model For Nearest Neighbor Search in High-Dimensional Data Space”. ACM PODS Symposium on Principles of Database Systems, Tucson, Arizona, 1997.
- 22. S. Berchtold, D.A. Keim, and H.-P. Kriegel. “The X-tree: An Indexing Structure for High-Dimensional Data”. Proc. 22nd Int. Conf. on Very Large Database, pages 28–39, September 1996.
- 23. P.M. Kelly, T.M. Cannon, and D.R. Hush. “Query by image example: the CANDID approach”. Proc. SPIE Storage and Retrieval for Image and Video Databases III, 2420: 238–248, 1995.