The Geometric Framework for Exact and Similarity Querying XML Data

  • Michal Krátký
  • Jaroslav Pokorný
  • Tomáš Skopal
  • Václav Snášel
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2510)

Abstract

In database terminology, XML can be viewed as a language for data modeling. Several query languages have been proposed for retrieving XML data from XML databases. Their common feature is the use of regular path expressions, which let the user navigate through arbitrarily long paths in XML data. If we consider the content of a path as a vector of path elements, we can model XML paths as points within a multidimensional vector space. This paper introduces a geometric framework for indexing and querying XML data conceived in this way. Consequently, we can use certain data structures for indexing multidimensional points (objects). We use the UB-tree for indexing vector spaces and the M-tree for indexing metric spaces. The data structures for indexing vector spaces are suited mainly to exact-match queries, while the structures for indexing metric spaces allow us to support similarity queries.
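The idea of treating paths as points can be illustrated with a minimal sketch. The abstract does not specify the encoding, so everything below is an assumption made for illustration: element names are mapped to integer ids in sorted order, shorter paths are padded with 0, exact matching is shown as a plain set lookup (standing in for a UB-tree), and similarity is shown as a linear-scan range query under edit distance (standing in for an M-tree). The function names are hypothetical and are not taken from the paper.

```python
import xml.etree.ElementTree as ET


def root_to_leaf_paths(xml_text):
    """Return every root-to-leaf path as a tuple of element names."""
    root = ET.fromstring(xml_text)
    paths = []

    def walk(node, prefix):
        current = prefix + (node.tag,)
        children = list(node)
        if not children:
            paths.append(current)
        for child in children:
            walk(child, current)

    walk(root, ())
    return paths


def paths_to_points(paths):
    """Encode paths as fixed-length integer vectors (assumed encoding)."""
    # Id 0 is reserved for padding; names receive ids 1..n in sorted order.
    names = sorted({name for path in paths for name in path})
    ids = {name: i + 1 for i, name in enumerate(names)}
    dim = max(len(path) for path in paths)
    points = [
        tuple(ids[name] for name in path) + (0,) * (dim - len(path))
        for path in paths
    ]
    return points, ids, dim


def edit_distance(a, b):
    """Levenshtein distance between two sequences; a metric on paths."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def range_query(paths, query, radius):
    """Linear-scan stand-in for a range query on a metric index."""
    return [p for p in paths if edit_distance(p, query) <= radius]


DOC = "<book><title/><chapter><title/><para/></chapter></book>"
paths = root_to_leaf_paths(DOC)
points, ids, dim = paths_to_points(paths)

# Exact match: is the path book/chapter/title present as a point?
exact_hit = tuple(ids[n] for n in ("book", "chapter", "title")) in set(points)

# Similarity: paths within edit distance 1 of book/section/title.
similar = range_query(paths, ("book", "section", "title"), 1)
```

In this sketch the vector encoding serves the exact-match lookup, while the similarity query works directly on the paths with a distance function, which mirrors the vector-space versus metric-space split described in the abstract. A real index would replace the set lookup and the linear scan.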

Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Michal Krátký (1)
  • Jaroslav Pokorný (2)
  • Tomáš Skopal (1)
  • Václav Snášel (1)

  1. Department of Computer Science, VŠB-Technical University of Ostrava, Czech Republic
  2. Department of Software Engineering, Charles University, Prague, Czech Republic
