Optimizing the Distance Computation Order of Multi-Feature Similarity Search Indexing

  • Marcel Zierenberg
  • Ingo Schmitt
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9371)

Abstract

Multi-feature search is an effective approach to similarity search. Unfortunately, search efficiency decreases as the number of features grows. Several indexing approaches aim to achieve efficiency by incrementally reducing the approximation error of aggregated distance bounds. They apply heuristics to determine the distance computation order and update each object's aggregated bounds after every computation. However, the existing indexing approaches suffer from several drawbacks: they use the same computation order for all objects, do not support important types of aggregation functions, and do not take the varying CPU and I/O costs of different distance computations into account. To resolve these problems, we introduce a new heuristic that determines an efficient distance computation order for each individual object. Our heuristic supports various important aggregation functions and calculates cost-benefit ratios to incorporate the varying computation costs of different distance functions. The experimental evaluation reveals that our heuristic outperforms state-of-the-art approaches in terms of both the number of distance computations and search time.
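
The general idea of refining aggregated distance bounds in a cost-aware order can be illustrated with a minimal Python sketch. It is not the heuristic from the paper, whose details are not given in the abstract. It assumes a weighted sum as the aggregation function, a range query with a fixed threshold, and a benefit estimate that collapses a feature's bounds to their midpoint. All names (`Feature`, `weighted_sum`, `pick_next`, `refine`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Set, Tuple


@dataclass
class Feature:
    lower: float                 # lower bound on the partial distance
    upper: float                 # upper bound on the partial distance
    cost: float                  # assumed CPU + I/O cost of the exact computation
    exact: Callable[[], float]   # computes the exact partial distance


def weighted_sum(weights: Sequence[float]) -> Callable[[Sequence[float]], float]:
    """Monotone aggregation function (an assumption; others are possible)."""
    return lambda values: sum(w * v for w, v in zip(weights, values))


def aggregated_gap(features: List[Feature], agg) -> float:
    """Width of the aggregated bound interval for one object."""
    return agg([f.upper for f in features]) - agg([f.lower for f in features])


def pick_next(features: List[Feature], pending: Set[int], agg) -> int:
    """Pick the pending feature with the best estimated cost-benefit ratio.

    Benefit is estimated as the reduction of the aggregated gap if the
    feature's bounds collapsed to their midpoint (a simplifying assumption).
    """
    gap_now = aggregated_gap(features, agg)

    def ratio(i: int) -> float:
        f = features[i]
        mid = (f.lower + f.upper) / 2
        trial = features[:i] + [Feature(mid, mid, f.cost, f.exact)] + features[i + 1:]
        return (gap_now - aggregated_gap(trial, agg)) / f.cost

    return max(sorted(pending), key=ratio)


def refine(features: Sequence[Feature], agg, threshold: float) -> Tuple[List[int], float, float]:
    """Refine one object's bounds until it is decided against the threshold.

    Returns the per-object computation order and the final aggregated bounds.
    """
    features = list(features)
    pending = {i for i, f in enumerate(features) if f.lower < f.upper}
    order: List[int] = []
    while pending:
        lower = agg([f.lower for f in features])
        upper = agg([f.upper for f in features])
        if lower > threshold or upper <= threshold:
            break  # object is already excluded or already qualifies
        i = pick_next(features, pending, agg)
        d = features[i].exact()  # the expensive step being minimized
        features[i] = Feature(d, d, features[i].cost, features[i].exact)
        pending.remove(i)
        order.append(i)
    return order, agg([f.lower for f in features]), agg([f.upper for f in features])


if __name__ == "__main__":
    obj = [
        Feature(0.0, 1.0, 1.0, lambda: 0.9),
        Feature(0.2, 0.4, 1.0, lambda: 0.3),
        Feature(0.0, 1.0, 10.0, lambda: 0.5),
    ]
    print(refine(obj, weighted_sum([1.0, 1.0, 1.0]), threshold=1.0))
```

Because the order is chosen from each object's own bounds and costs, two objects with different bound intervals can be refined in different orders, and an expensive feature is deferred unless its expected tightening justifies the cost.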

Keywords

Distance Function, Partial Distance, Search Time, Distance Computation, Aggregation Function
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Institute of Computer Science, Information and Media Technology, Chair of Database and Information Systems, Brandenburg University of Technology Cottbus - Senftenberg, Cottbus, Germany
