Knowledge and Information Systems, Volume 10, Issue 2, pp 211–227

Extending metric index structures for efficient range query processing

  • Karin Kailing
  • Hans-Peter Kriegel
  • Martin Pfeifle
  • Stefan Schönauer
Short Paper

Abstract

Databases are becoming increasingly important for storing complex objects from scientific, engineering, or multimedia applications. Examples of such data are chemical compounds, CAD drawings, and XML data. Efficient search for similar objects in such databases is a key feature. However, many similarity measures for complex objects are computationally expensive, which makes them impractical for large databases. In this paper, we combine and extend two techniques, metric index structures and multi-step query processing, to improve the performance of range query processing. The efficiency of our methods is demonstrated in extensive experiments on real-world data including graphs, trees, and vector sets.
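As a rough illustration of the multi-step idea named in the abstract, the following minimal sketch runs a range query in a filter step and a refinement step. It is not the authors' method: it scans the data sequentially instead of using a metric index, and it uses strings with edit distance as a stand-in for the graphs, trees, and vector sets studied in the paper. The names `multi_step_range_query`, `length_difference`, and `edit_distance` are hypothetical. The only property the sketch relies on is that the cheap filter distance never exceeds the exact distance, so discarding by the filter cannot lose a true result.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def multi_step_range_query(
    database: Iterable[T],
    query: T,
    eps: float,
    filter_dist: Callable[[T, T], float],
    exact_dist: Callable[[T, T], float],
) -> List[T]:
    """Return all objects whose exact distance to `query` is at most `eps`.

    Assumes filter_dist(a, b) <= exact_dist(a, b) for all a, b.
    """
    results = []
    for obj in database:
        # Filter step: a lower bound above eps implies the exact
        # distance is above eps too, so the object can be skipped.
        if filter_dist(query, obj) > eps:
            continue
        # Refinement step: pay for the expensive distance only here.
        if exact_dist(query, obj) <= eps:
            results.append(obj)
    return results


def length_difference(a: str, b: str) -> int:
    """Cheap lower bound of the edit distance (illustrative filter)."""
    return abs(len(a) - len(b))


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, standing in for an expensive exact measure."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


words = ["tree", "trees", "graph", "vector", "tee", "treasure"]
hits = multi_step_range_query(words, "tree", 1, length_difference, edit_distance)
print(hits)
```

In this toy run the filter discards "vector" and "treasure" without computing an edit distance, and the refinement step rejects "graph". A metric index would additionally prune whole groups of objects by the triangle inequality, which this sketch does not attempt.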

Keywords

Complex objects, Metric indexing, Multi-step query processing, Density-based clustering

Copyright information

© Springer-Verlag London Ltd 2006

Authors and Affiliations

  • Karin Kailing (1)
  • Hans-Peter Kriegel (1)
  • Martin Pfeifle (1)
  • Stefan Schönauer (1)
  1. Institute for Computer Science, University of Munich, Munich, Germany