# Distance-Based Index Structures for Fast Similarity Search


## Abstract

This review considers a class of index structures for fast similarity search that are built and queried using only the values or ranks of distances (or similarities) between objects. Search with metric distances (which satisfy the triangle inequality and the other metric axioms) and with nonmetric distances is discussed. Both structures for exact search, which return exactly the objects of the database that answer a query, and structures for approximate similarity search are presented; the latter do not guarantee exact results, but usually return results close to the exact ones and run faster than exact-search structures. General principles for constructing and applying these index structures are stated, and the ideas underlying specific algorithms, both well-known and recently proposed, are considered.
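As a rough illustration of how the triangle inequality allows a distance-based index to skip distance computations, the following Python sketch implements pivot-based filtering for a range query. It is not taken from the review: the names `PivotIndex`, `range_query`, and `euclidean` are hypothetical, and the Euclidean distance is assumed only for the sake of the example. For a pivot p, the triangle inequality gives |d(q, p) - d(o, p)| <= d(q, o), so an object o whose lower bound exceeds the query radius can be discarded without computing d(q, o).

```python
import math


def euclidean(a, b):
    """Euclidean distance; any metric could be substituted here."""
    return math.dist(a, b)


class PivotIndex:
    """Illustrative pivot table: stores distances from every object to a few pivots."""

    def __init__(self, objects, pivots, dist):
        self.objects = list(objects)
        self.pivots = list(pivots)
        self.dist = dist
        # Precompute the distance from each object to each pivot (index construction).
        self.table = [[dist(o, p) for p in self.pivots] for o in self.objects]

    def range_query(self, q, radius):
        """Return (objects within `radius` of q, number of object distances computed)."""
        q_to_pivots = [self.dist(q, p) for p in self.pivots]
        results = []
        computed = 0
        for obj, row in zip(self.objects, self.table):
            # Lower bound on d(q, obj) implied by the triangle inequality.
            lower = max(
                (abs(qp - op) for qp, op in zip(q_to_pivots, row)),
                default=0.0,
            )
            if lower > radius:
                continue  # obj cannot be within the radius; skip the distance computation
            computed += 1
            if self.dist(q, obj) <= radius:
                results.append(obj)
        return results, computed


# Usage: a 10 x 10 grid of points, two pivots chosen arbitrarily.
points = [(float(x), float(y)) for x in range(10) for y in range(10)]
index = PivotIndex(points, [(0.0, 0.0), (9.0, 0.0)], euclidean)
found, computed = index.range_query((2.0, 3.0), 1.5)
```

The answer is exact for a metric distance, because the filter only removes objects that provably lie outside the query radius. The saving is the number of object-to-query distance computations avoided, which matters most when the distance itself is expensive to evaluate.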

## Keywords

similarity search, nearest neighbor search, index structure, distance-based indexing, metric distance, nonmetric distance, metric tree, neighborhood graph, branch and bound method
