Metric Embedding into the Hamming Space with the n-Simplex Projection

  • Lucia Vadicamo (email author)
  • Vladimir Mic
  • Fabrizio Falchi
  • Pavel Zezula
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11807)

Abstract

Transformations of data objects into the Hamming space are often exploited to speed up similarity search in metric spaces. Techniques applicable in generic metric spaces require expensive learning, e.g., selection of pivoting objects. However, when searching in the common Euclidean space, the best performance is usually achieved by transformations specifically designed for this space. We propose a novel transformation technique that provides a good trade-off between applicability and the quality of the space approximation. It uses the n-Simplex projection to transform metric objects into a low-dimensional Euclidean space, and then transforms this space into the Hamming space. We compare our approach theoretically and experimentally with several techniques for metric embedding into the Hamming space. We focus on applicability, learning cost, and the quality of the search space approximation.
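The two-stage pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`apex`, `build_base`, `project`, `binarize`, `hamming`) are hypothetical, the pivots are assumed to be in general position, and the binarization step (thresholding each coordinate at its median) is a simple stand-in that may differ from the transformation used in the paper. The first stage places n pivots as simplex vertices and maps each object to an n-dimensional apex computed only from its distances to the pivots.

```python
import math
import random
import statistics

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def apex(base, dists):
    """Coordinates of a point at distances dists[j] from base vertex j.
    base[0] is the origin (empty list); base[j] has j coordinates, the last one nonzero."""
    x = []
    d0_sq = dists[0] ** 2
    for j in range(1, len(base)):
        v = base[j]
        # From |x - v|^2 = d_j^2 and |x|^2 = d_0^2: 2 v.x = |v|^2 + d_0^2 - d_j^2
        rhs = (sum(c * c for c in v) + d0_sq - dists[j] ** 2) / 2.0
        rhs -= sum(v[c] * x[c] for c in range(j - 1))
        x.append(rhs / v[j - 1])  # assumes pivots in general position (no zero divisor)
    # The last coordinate is the altitude above the subspace spanned by the base.
    x.append(math.sqrt(max(d0_sq - sum(c * c for c in x), 0.0)))
    return x

def build_base(pivots, dist):
    """Place the pivots one by one as vertices of a simplex."""
    base = [[]]
    for i in range(1, len(pivots)):
        base.append(apex(base, [dist(pivots[i], p) for p in pivots[:i]]))
    return base

def project(obj, pivots, base, dist):
    """Map a metric object to an n-dimensional Euclidean vector (n = number of pivots)."""
    return apex(base, [dist(obj, p) for p in pivots])

def binarize(vectors):
    """Assumed binarization: one bit per coordinate, thresholded at the coordinate median."""
    dims = len(vectors[0])
    thresholds = [statistics.median(v[c] for v in vectors) for c in range(dims)]
    return [[1 if v[c] > thresholds[c] else 0 for c in range(dims)] for v in vectors]

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

# Toy usage on random 8-dimensional vectors with 4 pivots (illustrative data only).
random.seed(0)
data = [[random.gauss(0.0, 1.0) for _ in range(8)] for _ in range(50)]
pivots, objects = data[:4], data[4:]
base = build_base(pivots, euclidean)
vectors = [project(o, pivots, base, euclidean) for o in objects]
sketches = binarize(vectors)
```

For Euclidean input, the distance between two projected vectors is a lower bound on the original distance, which is what makes the intermediate space a usable approximation before binarization. The Hamming distance between two sketches can then be computed with `hamming(sketches[i], sketches[j])`.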

Keywords

Sketch, Metric search, Metric embedding, n-point property

Notes

Acknowledgements

The work was partially supported by VISECH ARCO-CNR, CUP B56J17001330004, and AI4EU project, funded by the EC (H2020 - Contract n. 825619). This research was supported by ERDF “CyberSecurity, CyberCrime and Critical Information Infrastructures Center of Excellence” (No. CZ.02.1.01/0.0/0.0/16_019/0000822).

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Lucia Vadicamo (1), email author
  • Vladimir Mic (2)
  • Fabrizio Falchi (1)
  • Pavel Zezula (2)

  1. Institute of Information Science and Technologies (ISTI), CNR, Pisa, Italy
  2. Masaryk University, Brno, Czech Republic