
Cybernetics and Systems Analysis, Volume 53, Issue 4, pp 636–658

Distance-Based Index Structures for Fast Similarity Search

  • D. A. Rachkovskij
Article

Abstract

This review considers the class of index structures for fast similarity search that are constructed and applied using only information on the values or ranks of some distances/similarities between objects. Search by metric distances (which satisfy the triangle inequality and the other metric axioms) and by nonmetric distances is discussed. Structures that return the objects of a database that constitute the exact answer to a search query are presented, along with structures for approximate similarity search (the latter do not guarantee exact results, but usually return results close to exact and operate faster than structures for exact search). General principles of the construction and application of some index structures are stated, and the ideas underlying concrete algorithms (both well-known and recently proposed) are considered.
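As an illustration of how a distance-based index can use the triangle inequality, the following is a minimal sketch of pivot-based range search. It is not taken from the article: the function names (`build_pivot_table`, `range_search`), the choice of Euclidean distance, and the sample data are all assumptions made for the example. The idea is that for any pivot p, |d(q, p) − d(o, p)| is a lower bound on d(q, o), so an object whose lower bound already exceeds the search radius can be discarded without computing its distance to the query.

```python
import math


def euclidean(a, b):
    # Any metric could be used here; Euclidean distance is an assumption.
    return math.dist(a, b)


def build_pivot_table(objects, pivots, dist):
    # Precompute the distance from every object to every pivot (the "index").
    return [[dist(o, p) for p in pivots] for o in objects]


def range_search(query, radius, objects, pivots, table, dist):
    # Distances from the query to the pivots are computed once per query.
    q_to_pivots = [dist(query, p) for p in pivots]
    results = []
    computed = 0  # number of query-to-object distances actually evaluated
    for obj, row in zip(objects, table):
        # Triangle inequality: |d(q,p) - d(o,p)| <= d(q,o) for every pivot p.
        lower_bound = max(abs(qp - op) for qp, op in zip(q_to_pivots, row))
        if lower_bound > radius:
            continue  # safely pruned: the object cannot lie within the radius
        computed += 1
        if dist(query, obj) <= radius:
            results.append(obj)
    return results, computed


# Hypothetical sample data for the sketch.
objects = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (10.0, 10.0), (0.5, 0.5)]
pivots = [(0.0, 0.0), (10.0, 10.0)]
table = build_pivot_table(objects, pivots, euclidean)
found, computed = range_search((0.2, 0.1), 1.0, objects, pivots, table, euclidean)
```

Because pruning relies only on a valid lower bound, this sketch returns the exact answer for a metric distance. With a nonmetric distance the bound may not hold, and the same filter could discard true answers.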

Keywords

similarity search; nearest neighbor search; index structure; distance-based indexing; metric distance; nonmetric distance; metric tree; neighborhood graph; branch and bound method


Copyright information

© Springer Science+Business Media, LLC 2017

Authors and Affiliations

  1. International Scientific-Educational Center of Information Technologies and Systems, NAS of Ukraine and MES of Ukraine, Kyiv, Ukraine
