Shortening the Candidate List for Similarity Searching Using Inverted Index

  • Conference paper
Pattern Recognition (MCPR 2021)

Abstract

Similarity searching consists of retrieving the elements of a database that are closest to a given query. One strategy selects some elements as references and uses them to organize the whole database. With these reference points, it is possible to obtain a candidate list that contains the answer to the query, so the rest of the database can be set aside, at least initially. In this article, a new strategy for reducing the candidate list is proposed. According to the experiments presented, the size of the list can be reduced by up to 35%.
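
The general idea of organizing a database around reference points can be illustrated with a minimal Python sketch. It is not the strategy proposed in the paper: the function names, the parameter `k`, the toy data, and the choice of Euclidean distance are all assumptions made for illustration. Each object is posted under its `k` closest references, and a query only examines objects that share at least one of its own closest references.

```python
import math
from collections import defaultdict

def euclidean(a, b):
    # Assumed metric for this sketch; any metric distance could be used.
    return math.dist(a, b)

def closest_refs(obj, refs, k, dist):
    """Return the indices of the k reference points closest to obj."""
    order = sorted(range(len(refs)), key=lambda i: dist(obj, refs[i]))
    return order[:k]

def build_index(db, refs, k, dist):
    """Inverted index: reference index -> set of ids of objects posted under it."""
    index = defaultdict(set)
    for obj_id, obj in enumerate(db):
        for r in closest_refs(obj, refs, k, dist):
            index[r].add(obj_id)
    return index

def candidate_list(query, refs, index, k, dist):
    """Ids of objects sharing at least one of the query's k closest references."""
    candidates = set()
    for r in closest_refs(query, refs, k, dist):
        candidates |= index.get(r, set())
    return sorted(candidates)

def nearest(query, db, candidates, dist):
    """Compute real distances only for the candidates and return the best id."""
    return min(candidates, key=lambda i: dist(query, db[i]))

# Toy data (hypothetical): three small clusters in the plane.
db = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11), (0, 10), (1, 11)]
refs = [db[0], db[3], db[6]]  # some database elements chosen as references
index = build_index(db, refs, k=1, dist=euclidean)

query = (0.9, 0.2)
candidates = candidate_list(query, refs, index, k=1, dist=euclidean)
best = nearest(query, db, candidates, euclidean)
```

In this toy run only the objects posted under the query's closest reference are compared against the query, so the number of distance evaluations drops from the full database size to the length of the candidate list. A list built this way may miss the true nearest neighbor, which is why such schemes are usually described as approximate.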

Author information

Corresponding author

Correspondence to Karina Figueroa.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Figueroa, K., Camarena-Ibarrola, A., Reyes, N. (2021). Shortening the Candidate List for Similarity Searching Using Inverted Index. In: Roman-Rangel, E., Kuri-Morales, Á.F., Martínez-Trinidad, J.F., Carrasco-Ochoa, J.A., Olvera-López, J.A. (eds) Pattern Recognition. MCPR 2021. Lecture Notes in Computer Science, vol 12725. Springer, Cham. https://doi.org/10.1007/978-3-030-77004-4_9

  • DOI: https://doi.org/10.1007/978-3-030-77004-4_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-77003-7

  • Online ISBN: 978-3-030-77004-4

  • eBook Packages: Computer Science (R0)
