An Algorithm for Incremental Nearest Neighbor Search in High-Dimensional Data Spaces

  • Dong-Ho Lee
  • Hyung-Dong Lee
  • Il-Hwan Choi
  • Hyoung-Joo Kim
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2105)

Abstract

The SPY-TEC (Spherical Pyramid-Technique) [8] was proposed as a new indexing method for high-dimensional data spaces. It uses a special partitioning strategy that divides a d-dimensional data space into 2d spherical pyramids. Although the authors of [8] proposed an efficient algorithm for processing hyperspherical range queries, they did not propose an algorithm for processing k-nearest neighbor queries, which are frequently used in similarity search. In this paper, we propose an efficient algorithm for processing exact nearest neighbor queries on the SPY-TEC by extending the incremental nearest neighbor algorithm proposed in [10]. We also introduce a metric that can be used to guide an ordered best-first traversal when finding nearest neighbors on the SPY-TEC. Finally, we show through extensive experiments that our technique significantly outperforms the R*-tree, the X-tree, and the sequential scan in processing k-nearest neighbor queries.
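
To illustrate the ordered best-first traversal mentioned above, the following is a minimal Python sketch of a generic incremental nearest neighbor search in the style of [10]. It is not the algorithm or the metric proposed in this paper: the flat list of buckets, their axis-aligned bounding boxes, and the names `min_dist_to_box` and `incremental_nn` are assumptions made for illustration, standing in for the spherical pyramid partitions and the SPY-TEC distance metric.

```python
import heapq
import itertools
import math


def min_dist_to_box(q, lo, hi):
    """Smallest Euclidean distance from point q to the box [lo, hi]."""
    return math.sqrt(
        sum(max(l - x, 0.0, x - h) ** 2 for x, l, h in zip(q, lo, hi))
    )


def incremental_nn(buckets, q):
    """Yield (distance, point) pairs in nondecreasing distance from q.

    Each bucket is a list of points and stands in for one index partition
    (hypothetical layout, not the SPY-TEC structure). A bucket is keyed by
    a lower bound on the distance to any point it holds, so a point popped
    from the queue cannot be beaten by anything still unexpanded.
    """
    tie = itertools.count()  # tie-breaker so the heap never compares payloads
    heap = []
    for pts in buckets:
        if not pts:
            continue
        lo = [min(c) for c in zip(*pts)]  # per-dimension minimum
        hi = [max(c) for c in zip(*pts)]  # per-dimension maximum
        heapq.heappush(heap, (min_dist_to_box(q, lo, hi), next(tie), False, pts))
    while heap:
        d, _, is_point, item = heapq.heappop(heap)
        if is_point:
            yield d, item  # next nearest neighbor
        else:
            # Expand the bucket: enqueue its points with exact distances.
            for p in item:
                heapq.heappush(heap, (math.dist(q, p), next(tie), True, p))


buckets = [[(0.1, 0.1), (0.2, 0.3)], [(0.8, 0.9), (0.7, 0.6)], [(0.5, 0.5)]]
two_nearest = list(itertools.islice(incremental_nn(buckets, (0.6, 0.6)), 2))
```

Because the search is a generator, a k-nearest neighbor query simply takes the first k results, and a caller can keep asking for further neighbors without restarting the search.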

Keywords

Similarity Search, High-Dimensional Index Technique, Nearest Neighbor Query, Incremental Nearest Neighbor Algorithm, Approximate Nearest Neighbor Algorithm, SPY-TEC

References

  1. A. Guttman. “R-trees: a dynamic index structure for spatial searching”. Proc. ACM SIGMOD Int. Conf. on Management of Data, pages 47–57, June 1984.
  2. A. Henrich. “The LSDh-Tree: An Access Structure for Feature Vectors”. Proc. 14th Int. Conf. on Data Engineering, pages 362–369, 1998.
  3. C. Faloutsos, R. Barber, M. Flickner, J. Hafner, W. Niblack, D. Petkovic, and W. Equitz. “Efficient and Effective Querying by Image Content”. Journal of Intelligent Information Systems (JIIS), 3(3):231–262, July 1994.
  4. B.C. Ooi, K.L. Tan, T.S. Chua, and W. Hsu. “Fast image retrieval using color-spatial information”. The VLDB Journal, 7(2):115–128, 1998.
  5. C.E. Jacobs, A. Finkelstein, and D.H. Salesin. “Fast Multiresolution Image Query”. Proc. of the 1995 ACM SIGGRAPH, New York, 1995.
  6. D.A. White and R. Jain. “Similarity Indexing with the SS-tree”. Proc. 12th Int. Conf. on Data Engineering, pages 516–523, 1996.
  7. D.B. Lomet and B. Salzberg. “The hB-Tree: A Multiattribute Indexing Method with Good Guaranteed Performance”. ACM Transactions on Database Systems, 15(4):625–658, 1990.
  8. D.H. Lee and H.J. Kim. “SPY-TEC: An Efficient Indexing Method for Similarity Search in High-Dimensional Data Spaces”. Data & Knowledge Engineering, 34(1):77–97, 2000.
  9. C. Faloutsos. “Fast Searching by Content in Multimedia Databases”. Data Engineering Bulletin, 18(4), 1995.
  10. G.R. Hjaltason and H. Samet. “Distance Browsing in Spatial Databases”. ACM Transactions on Database Systems, 24(2):265–318, 1999.
  11. J. Bentley. “Multidimensional binary search trees used for associative searching”. Communications of the ACM, 18(9):509–517, 1975.
  12. J.R. Smith and S.-F. Chang. “VisualSEEk: a fully automated content-based image query system”. ACM Multimedia 96, Boston, MA, 1996.
  13. J.T. Robinson. “The K-D-B-tree: a Search Structure for Large Multidimensional Dynamic Indexes”. Proc. ACM SIGMOD, Ann Arbor, USA, pages 10–18, April 1981.
  14. K.-I. Lin, H.V. Jagadish, and C. Faloutsos. “The TV-tree: An Index Structure for High-Dimensional Data”. The VLDB Journal, 3(4):517–542, 1994.
  15. L. Leithold. “Trigonometry”. Addison-Wesley, 1989.
  16. D.H. Lee and H.J. Kim. “An Efficient Nearest Neighbor Search in High-Dimensional Data Spaces”. Seoul National University, CE Technical Report (OOPSLA-TR1028), http://oopsla.snu.ac.kr/~dhlee/OOPSLA-TR1028.ps, 2000.
  17. N. Katayama and S. Satoh. “The SR-tree: An Index Structure for High-Dimensional Nearest Neighbor Queries”. Proc. ACM SIGMOD Int. Conf. on Management of Data, pages 517–542, May 1997.
  18. N. Roussopoulos, S. Kelley, and F. Vincent. “Nearest Neighbor Queries”. Proc. ACM SIGMOD Int. Conf. on Management of Data, pages 71–79, 1995.
  19. K. Beyer, J. Goldstein, R. Ramakrishnan, and U. Shaft. “When Is “Nearest Neighbor” Meaningful?”. Proc. 7th Int. Conf. on Database Theory, pages 217–235, January 1999.
  20. S. Berchtold, C. Böhm, and H.-P. Kriegel. “The Pyramid-Technique: Towards Breaking the Curse of Dimensionality”. Proc. ACM SIGMOD Int. Conf. on Management of Data, 1998.
  21. S. Berchtold, C. Böhm, D.A. Keim, and H.-P. Kriegel. “A Cost Model For Nearest Neighbor Search in High-Dimensional Data Space”. ACM PODS Symposium on Principles of Database Systems, Tucson, Arizona, 1997.
  22. S. Berchtold, D.A. Keim, and H.-P. Kriegel. “The X-tree: An Indexing Structure for High-Dimensional Data”. Proc. 22nd Int. Conf. on Very Large Data Bases, pages 28–39, September 1996.
  23. P.M. Kelly, T.M. Cannon, and D.R. Hush. “Query by image example: the CANDID approach”. Proc. SPIE Storage and Retrieval for Image and Video Databases III, 2420:238–248, 1995.

Copyright information

© Springer-Verlag Berlin Heidelberg 2001

Authors and Affiliations

  • Dong-Ho Lee
  • Hyung-Dong Lee
  • Il-Hwan Choi
  • Hyoung-Joo Kim
  1. OOPSLA Laboratory, School of Computer Science and Engineering, Seoul National University, Seoul, Korea
