Cybernetics and Systems Analysis, Volume 53, Issue 1, pp. 138–156

Binary Vectors for Fast Distance and Similarity Estimation

  • D. A. Rachkovskij
NEW TOOLS OF CYBERNETICS, INFORMATICS, COMPUTER ENGINEERING, AND SYSTEMS ANALYSIS

Abstract

This review considers methods and algorithms for the fast estimation of distance and similarity measures between initial data objects using vector representations with binary or integer-valued components. These representations are obtained from initial data that are mainly high-dimensional vectors equipped with various distance measures (angular, Euclidean, and others) and similarity measures (cosine, inner product, and others). The discussion focuses on learning-free methods, chiefly random projections followed by quantization, and also on sampling methods. The resulting vectors can be applied in similarity search, machine learning, and other algorithms.
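As a concrete illustration of the first family of methods (random projection followed by quantization), the sketch below implements sign random projections (SimHash): each input vector is projected onto random Gaussian directions, each projection is quantized to one bit by its sign, and the normalized Hamming distance between the resulting binary vectors estimates the angle between the original vectors. This is a minimal, generic sketch of the well-known technique, not code from the reviewed works; all names and parameters are illustrative.

```python
import numpy as np

def sign_random_projection(x, R):
    """Quantize the random projections R @ x to one bit each (their signs)."""
    return (R @ x >= 0).astype(np.uint8)

def estimate_angle(bx, by):
    """Estimate the angle between the original vectors from two binary sketches.

    For Gaussian projections, each bit differs with probability angle / pi
    (Goemans-Williamson), so the normalized Hamming distance times pi is an
    unbiased estimate of the angle.
    """
    return np.mean(bx != by) * np.pi

rng = np.random.default_rng(0)
d, k = 1000, 256                        # input dimension, bits per sketch
R = rng.standard_normal((k, d))         # random Gaussian projection matrix

x = rng.standard_normal(d)
y = x + 0.5 * rng.standard_normal(d)    # a perturbed copy of x

true_angle = np.arccos(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))
est_angle = estimate_angle(sign_random_projection(x, R),
                           sign_random_projection(y, R))
print(f"true angle {true_angle:.3f} rad, estimated {est_angle:.3f} rad")
```

The sampling family can be sketched just as briefly with minwise hashing: for each of k (approximately) random permutations of the universe, a set is represented by its minimum hashed element, and the fraction of positions where two sets share the same minimum estimates their Jaccard similarity. The salted built-in hash below only approximates min-wise independent permutations; again, everything here is illustrative.

```python
import numpy as np

def minhash(items, k, seed=0):
    """Return k minwise samples of a set, one per salted hash function."""
    salts = np.random.default_rng(seed).integers(1, 2**31, size=k)
    return np.array([min(hash((int(s), it)) for it in items) for s in salts])

a = {"the", "quick", "brown", "fox"}
b = {"the", "quick", "red", "fox"}

sa, sb = minhash(a, 128), minhash(b, 128)
print("estimated Jaccard:", np.mean(sa == sb))
print("true Jaccard:", len(a & b) / len(a | b))
```

Both sketches replace expensive operations on the original high-dimensional data with cheap Hamming-space or equality comparisons on short binary codes, which is the common thread of the methods surveyed here.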

Keywords

distance, similarity, embedding, sketch, random projection, sampling, binarization, quantization, Johnson–Lindenstrauss lemma, kernel similarity, similarity search, locality-sensitive hashing



Copyright information

© Springer Science+Business Media New York 2017

Authors and Affiliations

1. International Scientific-Educational Center of Information Technologies and Systems, NAS and MES of Ukraine, Kyiv, Ukraine