Re-ranking Permutation-Based Candidate Sets with the n-Simplex Projection

  • Giuseppe Amato
  • Edgar Chávez
  • Richard Connor
  • Fabrizio Falchi
  • Claudio Gennaro
  • Lucia Vadicamo
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11223)

Abstract

In the realm of metric search, permutation-based approaches have shown very good performance in indexing and supporting approximate search on large databases. These methods embed the metric objects into a permutation space where candidate results for a given query can be efficiently identified. Typically, to achieve high effectiveness, the permutation-based result set is refined by directly comparing each candidate object to the query. One drawback of these approaches is therefore that the original dataset needs to be stored and then accessed during the refining step. We propose a refining approach based on a metric embedding, called n-Simplex projection, that can be used on metric spaces meeting the n-point property. The n-Simplex projection provides upper and lower bounds of the actual distance, derived using the distances between the data objects and a finite set of pivots. We propose to reuse the distances computed for building the data permutations to derive these bounds, and we show how to use them to improve the permutation-based results. Our approach is particularly advantageous in all cases where the traditional refining step is too costly, e.g. with very large datasets or very expensive metric functions.
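
The idea of reusing object-pivot distances for both the permutation and the distance bounds can be illustrated with a minimal sketch. It assumes Euclidean distance (a space that satisfies the n-point property) and builds the simplex coordinates from a Cholesky factorisation of the pivot Gram matrix; this is one possible construction and not necessarily the procedure used in the paper. All function names (`pivot_permutation`, `build_base`, `apex`, `bounds`) are hypothetical and chosen for illustration only.

```python
import numpy as np

def pivot_permutation(dists_to_pivots):
    """Pivot indices ordered by increasing distance from the object."""
    return np.argsort(dists_to_pivots, kind="stable")

def build_base(pivot_dists):
    """Coordinates of pivots 1..n-1 in R^(n-1), with pivot 0 at the origin.

    pivot_dists is the n x n matrix of pairwise pivot distances. The rows of
    the returned lower-triangular Cholesky factor are the pivot coordinates.
    Assumes the pivots are in general position (Gram matrix positive definite).
    """
    d0 = pivot_dists[0, 1:] ** 2
    gram = 0.5 * (d0[:, None] + d0[None, :] - pivot_dists[1:, 1:] ** 2)
    return np.linalg.cholesky(gram)

def apex(base, pivot_dists, dists_to_pivots):
    """Map an object to R^n using only its distances to the n pivots."""
    d = np.asarray(dists_to_pivots, dtype=float) ** 2
    # Inner products between (object - pivot 0) and (pivot i - pivot 0).
    g = 0.5 * (d[0] + pivot_dists[0, 1:] ** 2 - d[1:])
    x = np.linalg.solve(base, g)          # component inside the pivot span
    h = np.sqrt(max(d[0] - x @ x, 0.0))   # altitude above the pivot span
    return np.append(x, h)

def bounds(a, b):
    """Lower and upper bounds on the distance between two projected objects."""
    base_sq = np.sum((a[:-1] - b[:-1]) ** 2)
    lower = np.sqrt(base_sq + (a[-1] - b[-1]) ** 2)
    upper = np.sqrt(base_sq + (a[-1] + b[-1]) ** 2)
    return lower, upper

# Illustrative data (assumption): 50 random vectors, the first 5 act as pivots.
rng = np.random.default_rng(0)
data = rng.normal(size=(50, 20))
pivots, objects = data[:5], data[5:]
pivot_dists = np.linalg.norm(pivots[:, None] - pivots[None, :], axis=-1)
base = build_base(pivot_dists)

# The same object-pivot distances feed both the permutation and the apex.
to_pivots = np.linalg.norm(objects[:, None] - pivots[None, :], axis=-1)
permutations = [pivot_permutation(row) for row in to_pivots]
apexes = [apex(base, pivot_dists, row) for row in to_pivots]

lower, upper = bounds(apexes[0], apexes[1])
true = np.linalg.norm(objects[0] - objects[1])
print(lower, true, upper)
```

In a re-ranking setting, the candidates returned by the permutation index could then be ordered by `lower`, `upper`, or some combination of the two, without accessing the original objects; which combination works best is an empirical question that this sketch does not address.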

Keywords

Metric search, Permutation-based indexing, n-point property, n-Simplex projection, Metric embedding, Distance bounds

Notes

Acknowledgements

The work was partially funded by Smart News, “Social sensing for breaking news”, CUP CIPE D58C15000270008, and by VISECH, ARCO-CNR, CUP B56J17001330004.

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Giuseppe Amato (1)
  • Edgar Chávez (2)
  • Richard Connor (3)
  • Fabrizio Falchi (1)
  • Claudio Gennaro (1)
  • Lucia Vadicamo (1)

  1. Institute of Information Science and Technologies (ISTI), CNR, Pisa, Italy
  2. Department of Computer Science, CICESE, Ensenada, Mexico
  3. Department of Computing Science, University of Stirling, Stirling, Scotland