
XML Query Processing Using a Schema-Based Numbering Scheme

  • Dao Dinh Kha
  • Masatoshi Yoshikawa
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3186)

Abstract

Establishing the hierarchical order among XML elements is an essential function of XML query processing techniques. Although most XML documents have an associated DTD or XML schema, the document structure information has not been utilized efficiently in query processing techniques proposed so far. In this paper, we propose a novel technique that uses the DTD or XML schema to improve the disk I/O complexity of XML query processing. We present a schema-based numbering scheme called SPIDER that incorporates both structure information and tag names extracted from the document structure descriptions. Given the tag name and the identifier of an element, SPIDER can determine the tag names and the identifiers of the ancestor elements without disk I/O. Based on SPIDER, we designed a mechanism called VirtualJoin that significantly reduces the disk I/O workload for processing XML queries. Our experiments indicated that SPIDER outperforms the structural join techniques Stack-Tree and PathStack in XML query processing, especially for XML queries with heavy join workloads and large data sets.
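The abstract does not give SPIDER's actual encoding, so the following is only a minimal sketch of the general idea it describes: recovering the tag names and identifiers of an element's ancestors from its own tag name and identifier, using arithmetic and an in-memory schema map rather than disk I/O. As a stand-in it uses a classic k-ary virtual-tree numbering (root is 1, and the parent identifier is computed by integer division), which is not SPIDER itself. The schema `SCHEMA_PARENT`, the fan-out `K`, and all function names are hypothetical, and the sketch assumes a tree-shaped, non-recursive schema in which every tag has exactly one parent tag.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical tree-shaped schema: each tag name maps to its single parent tag.
# A real DTD may be recursive or let a tag occur under several parents;
# this sketch assumes it does not.
SCHEMA_PARENT: Dict[str, Optional[str]] = {
    "site": None,
    "people": "site",
    "person": "people",
    "name": "person",
}

K = 4  # assumed maximum fan-out of the virtual k-ary tree


def child_id(parent: int, j: int, k: int = K) -> int:
    """Identifier of the j-th child (1-based) of `parent` in a k-ary virtual tree."""
    if not 1 <= j <= k:
        raise ValueError("child position must be between 1 and k")
    return k * (parent - 1) + j + 1


def parent_id(node: int, k: int = K) -> int:
    """Identifier of the parent of `node`, computed arithmetically."""
    if node < 2:
        raise ValueError("the root has no parent")
    return (node - 2) // k + 1


def ancestors(tag: str, node: int, k: int = K) -> List[Tuple[str, int]]:
    """(tag, identifier) pairs of all ancestors, nearest first.

    Tag names come from the in-memory schema map and identifiers from
    arithmetic, so no stored element data is consulted.
    """
    result: List[Tuple[str, int]] = []
    while SCHEMA_PARENT[tag] is not None:
        tag = SCHEMA_PARENT[tag]
        node = parent_id(node, k)
        result.append((tag, node))
    return result
```

For instance, if `site` has identifier 1, its second child `people` gets 3, the first `person` under it gets 10, and that person's third child `name` gets 40; `ancestors("name", 40)` then yields `[("person", 10), ("people", 3), ("site", 1)]`. A join such as `person//name` could in principle be answered from the `name` identifiers alone, which is the kind of saving the abstract attributes to VirtualJoin.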

Keywords

Index Data, Child Node, Parent Node, Numbering Scheme, Path Expression
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Dao Dinh Kha (1)
  • Masatoshi Yoshikawa (2)
  1. IMI Project of COE Program, Graduate School of Information Science, Nagoya University, Japan
  2. Information Technology Center, Nagoya University, Japan
