Efficient Indexing of High Dimensional Normalized Histograms

  • Alexandru Coman
  • Jörg Sander
  • Mario A. Nascimento
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2736)

Abstract

This paper addresses the problem of indexing high dimensional normalized histogram data, i.e., D-dimensional feature vectors H where \(\sum_{i=1}^{D} H_i = 1\). These are often used as representations for multimedia objects in order to facilitate similarity query processing. By analyzing properties that are induced by the above constraint and that do not hold in general multi-dimensional spaces, we design a new split policy. We show that the performance of similarity queries for normalized histogram data can be significantly improved by exploiting such properties within a simple indexing framework. We are able to process nearest-neighbor queries up to 10 times faster than the SR-tree and 3 times faster than the A-tree.
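To make the normalization constraint concrete, the following minimal Python sketch normalizes raw bin counts so that they sum to 1 and checks one well-known geometric consequence: the Euclidean distance between any two normalized histograms is at most \(\sqrt{2}\), a bound that does not hold for arbitrary vectors. This is only an illustration of the kind of property the constraint induces. It is not the split policy proposed in the paper, and the helper names `normalize` and `euclidean` are assumptions introduced here.

```python
import math


def normalize(counts):
    """Scale non-negative bin counts so that they sum to 1."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("histogram must have positive total mass")
    return [c / total for c in counts]


def euclidean(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical 4-bin histograms, chosen only for illustration.
h1 = normalize([4, 0, 0, 0])  # all mass in the first bin
h2 = normalize([0, 0, 0, 7])  # all mass in the last bin
h3 = normalize([1, 2, 3, 4])  # mass spread over all bins

# Every normalized histogram satisfies the sum-to-one constraint.
assert math.isclose(sum(h3), 1.0)

# Two histograms with disjoint single-bin mass are as far apart as possible.
assert math.isclose(euclidean(h1, h2), math.sqrt(2))

# Any other pair stays within that bound.
assert euclidean(h1, h3) <= math.sqrt(2)
```

Because the bins sum to 1, the last bin is always determined by the others, so the data effectively lies on a (D-1)-dimensional surface rather than filling the whole D-dimensional space.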

Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Alexandru Coman (1)
  • Jörg Sander (1)
  • Mario A. Nascimento (1)
  1. Department of Computing Science, University of Alberta, Canada
