NM-Tree: Flexible Approximate Similarity Search in Metric and Non-metric Spaces

  • Tomáš Skopal
  • Jakub Lokoč
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5181)

Abstract

So far, an efficient similarity search in multimedia databases has been carried out by metric access methods (MAMs), where the utilized similarity measure had to satisfy the metric properties (reflexivity, non-negativity, symmetry, triangle inequality). Recently, the introduction of TriGen algorithm (turning any nonmetric into metric) enabled MAMs to perform also nonmetric similarity search. Moreover, it simultaneously enabled faster approximate search (either metric or nonmetric). However, a simple application of TriGen as the first step before MAMs’ indexing assumes a fixed “approximation level”, that is, a user-defined tolerance of retrieval precision is preset for the whole index lifetime. In this paper, we push the similarity search forward; we propose the NM-tree (nonmetric tree) – a modification of M-tree which natively aggregates the TriGen algorithm to support flexible approximate nonmetric or metric search. Specifically, at query time the NM-tree provides a user-defined level of retrieval efficiency/precision trade-off. We show the NM-tree could be used for general (non)metric search, while the desired retrieval precision can be flexibly tuned on-demand.

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. 1.
    Ashby, F., Perrin, N.: Toward a unified theory of similarity and recognition. Psychological Review 95(1), 124–150 (1988)CrossRefGoogle Scholar
  2. 2.
    Athitsos, V., Hadjieleftheriou, M., Kollios, G., Sclaroff, S.: Query-sensitive embeddings. In: SIGMOD 2005: Proceedings of the 2005 ACM SIGMOD international conference on Management of data, pp. 706–717. ACM Press, New York (2005)CrossRefGoogle Scholar
  3. 3.
    Chávez, E., Navarro, G.: A Probabilistic Spell for the Curse of Dimensionality. In: Buchsbaum, A.L., Snoeyink, J. (eds.) ALENEX 2001. LNCS, vol. 2153, pp. 147–160. Springer, Heidelberg (2001)CrossRefGoogle Scholar
  4. 4.
    Chávez, E., Navarro, G., Baeza-Yates, R., Marroquín, J.L.: Searching in metric spaces. ACM Computing Surveys 33(3), 273–321 (2001)CrossRefGoogle Scholar
  5. 5.
    Chen, L., Lian, X.: Efficient similarity search in nonmetric spaces with local constant embedding. IEEE Transactions on Knowledge and Data Engineering 20(3), 321–336 (2008)CrossRefGoogle Scholar
  6. 6.
    Ciaccia, P., Patella, M., Zezula, P.: M-tree: An Efficient Access Method for Similarity Search in Metric Spaces. In: VLDB 1997. LNCS, vol. 1263, pp. 426–435 (1997)Google Scholar
  7. 7.
    Farago, A., Linder, T., Lugosi, G.: Fast nearest-neighbor search in dissimilarity spaces. IEEE Transactions on Pattern Analysis and Machine Intelligence 15(9), 957–962 (1993)CrossRefGoogle Scholar
  8. 8.
    Goh, K.-S., Li, B., Chang, E.: DynDex: a dynamic and non-metric space indexer. In: ACM Multimedia (2002)Google Scholar
  9. 9.
    Hettich, S., Bay, S.: The UCI KDD archive (1999), http://kdd.ics.uci.edu
  10. 10.
    Jacobs, D., Weinshall, D., Gdalyahu, Y.: Classification with nonmetric distances: Image retrieval and class representation. IEEE Pattern Analysis and Machine Intelligence 22(6), 583–600 (2000)CrossRefGoogle Scholar
  11. 11.
    Krumhansl, C.L.: Concerning the applicability of geometric models to similar data: The interrelationship between similarity and spatial density. Psychological Review 85(5), 445–463 (1978)CrossRefGoogle Scholar
  12. 12.
    Kruskal, J.B.: Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika 29(1), 1–27 (1964)MATHCrossRefMathSciNetGoogle Scholar
  13. 13.
    Rosch, E.: Cognitive reference points. Cognitive Psychology 7, 532–547 (1975)CrossRefGoogle Scholar
  14. 14.
    Rothkopf, E.: A measure of stimulus similarity and errors in some paired-associate learning tasks. J. of Experimental Psychology 53(2), 94–101 (1957)CrossRefGoogle Scholar
  15. 15.
    Samet, H.: Foundations of Multidimensional and Metric Data Structures. Morgan Kaufmann, San Francisco (2006)MATHGoogle Scholar
  16. 16.
    Skopal, T.: On fast non-metric similarity search by metric access methods. In: Ioannidis, Y., Scholl, M.H., Schmidt, J.W., Matthes, F., Hatzopoulos, M., Böhm, K., Kemper, A., Grust, T., Böhm, C. (eds.) EDBT 2006. LNCS, vol. 3896, pp. 718–736. Springer, Heidelberg (2006)CrossRefGoogle Scholar
  17. 17.
    Skopal, T.: Unified framework for fast exact and approximate search in dissimilarity spaces. ACM Transactions on Database Systems 32(4), 1–46 (2007)CrossRefGoogle Scholar
  18. 18.
    Skopal, T., Pokorný, J., Krátký, M., Snášel, V.: Revisiting M-tree Building Principles. In: Kalinichenko, L.A., Manthey, R., Thalheim, B., Wloka, U. (eds.) ADBIS 2003. LNCS, vol. 2798, pp. 148–162. Springer, Heidelberg (2003)Google Scholar
  19. 19.
    Tversky, A.: Features of similarity. Psychological review 84(4), 327–352 (1977)CrossRefGoogle Scholar
  20. 20.
    Tversky, A., Gati, I.: Similarity, separability, and the triangle inequality. Psychological Review 89(2), 123–154 (1982)CrossRefGoogle Scholar
  21. 21.
    Zezula, P., Amato, G., Dohnal, V., Batko, M.: Similarity Search: The Metric Space Approach (Advances in Database Systems). Springer, Secaucus (2005)Google Scholar

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Tomáš Skopal
    • 1
  • Jakub Lokoč
    • 1
  1. 1.Department of Software EngineeringCharles University in Prague, FMPPragueCzech Republic

Personalised recommendations