New Permutation Dissimilarity Measures for Proximity Searching

  • Karina Figueroa
  • Rodrigo Paredes
  • Nora Reyes
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11223)

Abstract

Proximity searching consists of retrieving the objects most similar to a given query from a database. The usual approach is to use an index to improve the response time of online queries. Permutation-based algorithms (PBA) were introduced recently and have since been very successful. At its core, a PBA uses a metric between permutations, typically Spearman Footrule or Spearman Rho. Several proposals based on the PBA have been developed so far, and all of them use one of those metrics. In this paper, we present a new family of dissimilarity measures between permutations. According to our experimental evaluation, we can reduce the cost of the original technique by up to 30% while preserving its exceptional answer quality. Since our dissimilarity measures can be applied in any state-of-the-art PBA variant, the impact of our proposal is significant for the similarity search community.
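For readers unfamiliar with the two baseline metrics named in the abstract, the following is a minimal sketch of how a permutation is derived from a set of pivots and how Spearman Footrule and Spearman Rho compare two permutations. It does not implement the new dissimilarity measures proposed in the paper, which the abstract does not define. The function names, the one-dimensional toy data, and the choice to omit the square root in Spearman Rho (which does not change the resulting ordering) are assumptions made for illustration.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def permutation(obj: T, pivots: Sequence[T], dist: Callable[[T, T], float]) -> List[int]:
    """Pivot indices sorted by increasing distance to obj (ties broken by index)."""
    return sorted(range(len(pivots)), key=lambda i: (dist(obj, pivots[i]), i))


def inverse(perm: Sequence[int]) -> List[int]:
    """Return pos such that pos[p] is the rank of pivot p within perm."""
    pos = [0] * len(perm)
    for rank, pivot in enumerate(perm):
        pos[pivot] = rank
    return pos


def spearman_footrule(a: Sequence[int], b: Sequence[int]) -> int:
    """Sum of absolute differences between the ranks of each pivot."""
    return sum(abs(x - y) for x, y in zip(inverse(a), inverse(b)))


def spearman_rho(a: Sequence[int], b: Sequence[int]) -> int:
    """Sum of squared rank differences (square root omitted by assumption)."""
    return sum((x - y) ** 2 for x, y in zip(inverse(a), inverse(b)))


# Hypothetical toy data: points on a line with absolute difference as distance.
pivots = [0.0, 10.0, 4.0]
dist = lambda x, y: abs(x - y)

query_perm = permutation(3.0, pivots, dist)
far_perm = permutation(9.0, pivots, dist)
near_perm = permutation(1.0, pivots, dist)

# The object closer to the query is expected to have the more similar permutation.
print(spearman_footrule(query_perm, near_perm), spearman_footrule(query_perm, far_perm))
print(spearman_rho(query_perm, near_perm), spearman_rho(query_perm, far_perm))
```

In a PBA-style search, database objects would be visited in increasing order of such a permutation distance to the query, and only a fraction of them compared with the real distance function.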

Keywords

Approximate similarity searching, Permutation based algorithms, Permutation dissimilarity measures

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Universidad Michoacana, Morelia, México
  2. Universidad de Talca, Curicó, Chile
  3. Universidad Nacional de San Luis, San Luis, Argentina
