Clustered Chain Path Index for XML Document: Efficiently Processing Branch Queries

  • Hongqiang Wang
  • Jianzhong Li
  • Hongzhi Wang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4255)

Abstract

Branch query processing is a core operation of XML query processing. In recent years, a number of stack-based twig join algorithms have been proposed to process twig queries over a tag stream index. However, a tag stream index labels each element separately and ignores the similarity of elements that share the same structure; in addition, algorithms based on a tag stream index perform worse on large documents. In this paper, we propose a novel index, the Clustered Chain Path Index (CCPI for short), based on a novel labeling scheme: Clustered Chain Path labeling. The index has properties that support efficient processing of branch queries. On tree-structured XML documents it also has the same cardinality as the 1-index. Based on CCPI, we design two efficient algorithms: KMP-Match-Path, which processes queries without branches, and Related-Path-Segment-Join, which processes queries with branches. Experimental results show that the proposed CCPI-based query processing algorithms outperform other algorithms and scale well.
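
The abstract does not describe how KMP-Match-Path works, so the following is only an illustrative sketch of the general idea its name suggests: running standard Knuth-Morris-Pratt matching over a sequence of tag names instead of characters. The function names `failure_table` and `match_path`, the representation of a path as a list of tags, and the restriction to contiguous (child-axis only) query steps are all assumptions made for this example and are not taken from the paper.

```python
from typing import List, Sequence


def failure_table(pattern: Sequence[str]) -> List[int]:
    """Standard KMP failure function, computed over tag names."""
    table = [0] * len(pattern)
    k = 0  # length of the longest proper prefix that is also a suffix
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = table[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        table[i] = k
    return table


def match_path(path: Sequence[str], query: Sequence[str]) -> List[int]:
    """Return every offset at which `query` occurs as a contiguous run in `path`.

    Hypothetical helper: `path` stands for the tag names along one
    root-to-node path, and `query` for a branch-free query whose steps
    are all parent-child steps.
    """
    if not query:
        return []
    table = failure_table(query)
    hits: List[int] = []
    k = 0  # number of query tags matched so far
    for i, tag in enumerate(path):
        while k > 0 and tag != query[k]:
            k = table[k - 1]
        if tag == query[k]:
            k += 1
        if k == len(query):
            hits.append(i - k + 1)
            k = table[k - 1]  # keep scanning for overlapping matches
    return hits


if __name__ == "__main__":
    doc_path = ["site", "regions", "item", "name"]
    print(match_path(doc_path, ["item", "name"]))
```

Matching against one stored path per group of same-structured elements, rather than against each element separately, is the kind of saving the abstract attributes to clustering; how CCPI actually stores and matches paths is not specified here.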

Keywords

Query Processing; Query Plan; Path Query; Path Expression; XPath Expression
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Hongqiang Wang (1)
  • Jianzhong Li (1)
  • Hongzhi Wang (1)
  1. School of Computer Science and Technology, Harbin Institute of Technology, Harbin
