Quick and Easy Implementation of Approximate Similarity Search with Lucene

  • Conference paper
Digital Libraries and Archives (IRCDL 2012)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 354)

Abstract

Similarity search has proved to be an effective technique for retrieving multimedia content. However, as the amount of available multimedia data grows, the cost of developing a robust and scalable content-based image retrieval system from scratch becomes prohibitive.

In this paper, we propose an approach that converts low-level features into a textual form. This makes it easy to set up a retrieval system on top of the Lucene search engine library that combines full-text search with approximate similarity search.
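
The abstract does not spell out how features are turned into text. As a minimal sketch of one plausible encoding, in the spirit of permutation-based indexing, a feature vector can be described by the reference objects closest to it, with each reference object's token repeated more often the closer it is, so that term frequency carries the ranking. The function name, the `RO` token prefix, the Euclidean distance, and the parameter `k` are all assumptions made for illustration, not details taken from the paper.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def to_surrogate_text(feature, reference_objects, k):
    """Encode a feature vector as a whitespace-separated token string.

    Hypothetical scheme: rank the reference objects by distance to the
    feature, keep the k closest, and repeat each one's token (k - rank)
    times, so the closest appears k times and the k-th closest once.
    """
    ranked = sorted(
        range(len(reference_objects)),
        key=lambda i: euclidean(feature, reference_objects[i]),
    )
    tokens = []
    for rank, ref_id in enumerate(ranked[:k]):
        tokens.extend([f"RO{ref_id}"] * (k - rank))
    return " ".join(tokens)


if __name__ == "__main__":
    # Four illustrative reference objects in a 2-D feature space.
    refs = [(0, 0), (10, 0), (0, 10), (10, 10)]
    print(to_surrogate_text((1, 2), refs, k=3))
    # prints: RO0 RO0 RO0 RO2 RO2 RO1
```

A string like this could be stored in an ordinary text field next to titles or tags, and a query image encoded the same way. Objects that share highly ranked reference objects would then tend to score well under a term-frequency-based ranking, which is what would let a full-text engine approximate similarity search.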

This work was partially supported by the ASSETS project funded by the European Commission.

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Amato, G., Bolettieri, P., Gennaro, C., Rabitti, F. (2013). Quick and Easy Implementation of Approximate Similarity Search with Lucene. In: Agosti, M., Esposito, F., Ferilli, S., Ferro, N. (eds) Digital Libraries and Archives. IRCDL 2012. Communications in Computer and Information Science, vol 354. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35834-0_17

  • DOI: https://doi.org/10.1007/978-3-642-35834-0_17

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-35833-3

  • Online ISBN: 978-3-642-35834-0

  • eBook Packages: Computer Science, Computer Science (R0)
