Efficient Similarity Search by Combining Indexing and Caching Strategies
A critical issue in large-scale search engines is efficiently handling sudden peaks of incoming query traffic. Research in metric spaces has addressed this problem by creating caches that, when possible, answer a query exactly or approximately very quickly, without further processing an index. One drawback of that approach is that, if the cache cannot provide an answer, the distances computed up to that point are wasted and the search must proceed through the index structure. In this paper we present an index structure that serves a twofold role: it acts as both a cache and an index. In this way, if we cannot provide a quick approximate answer to the query, the distances already computed are reused to query the index. We present an experimental evaluation of the performance obtained with our structure.
Keywords: Cluster Center, Index Structure, Distance Computation, Range Query, Range Search
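The idea of reusing cache-lookup distances for the index search can be illustrated with a minimal sketch. This is not the structure proposed in the paper, whose details the abstract does not give. It assumes a simple cluster-based layout in which each cluster center doubles as a cache entry. The names `CacheIndex`, `range_query`, and `cache_eps` (how close a query must be to a center to count as a cache hit) are hypothetical, and Euclidean distance stands in for an arbitrary metric.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Cluster:
    center: tuple
    radius: float = 0.0  # covering radius: largest member-to-center distance
    members: list = field(default_factory=list)  # (object, distance to center)


class CacheIndex:
    """Hypothetical structure: cluster centers act as cache entries and as index pivots."""

    def __init__(self, centers, data, dist=math.dist):
        self.dist = dist
        self.distance_count = 0  # distances computed at query time
        self.clusters = [Cluster(c) for c in centers]
        for obj in data:
            # Assign each object to its nearest center and store that distance.
            best = min(self.clusters, key=lambda cl: dist(cl.center, obj))
            d = dist(best.center, obj)
            best.members.append((obj, d))
            best.radius = max(best.radius, d)

    def _d(self, a, b):
        self.distance_count += 1
        return self.dist(a, b)

    def range_query(self, q, r, cache_eps):
        # Step 1: distances from the query to every center (needed by both roles).
        center_dists = [self._d(q, cl.center) for cl in self.clusters]

        # Cache role: if the query is very close to a center, answer as if the
        # query were that center, using only the stored distances.
        i = min(range(len(center_dists)), key=center_dists.__getitem__)
        if center_dists[i] <= cache_eps:
            hit = self.clusters[i]
            return [obj for obj, d in hit.members if d <= r], "approximate"

        # Index role: the same center distances are reused for pruning.
        result = []
        for cl, dqc in zip(self.clusters, center_dists):
            if dqc - cl.radius > r:
                continue  # the query ball cannot intersect this cluster
            for obj, doc in cl.members:
                if abs(dqc - doc) > r:
                    continue  # excluded by the triangle inequality
                if self._d(q, obj) <= r:
                    result.append(obj)
        return result, "exact"
```

In this sketch a cache miss costs nothing extra: the query-to-center distances computed for the cache test are exactly the ones the pruning step needs, which is the property the abstract emphasizes.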