World Wide Web, Volume 8, Issue 1, pp 5–26

Dynamically Updating XML Data: Numbering Scheme Revisited

  • Jeffrey Xu Yu
  • Daofeng Luo
  • Xiaofeng Meng
  • Hongjun Lu

Abstract

Almost all existing approaches use some numbering scheme to encode XML elements and so facilitate query processing when XML data is stored in databases. For example, under the most popular region-based numbering scheme, the starting and ending positions of an element in a document are used as the code that identifies the element, so that the ancestor/descendant relationship between two elements can be determined merely by examining their codes. While such a numbering scheme can greatly improve query performance, renumbering the large number of elements affected by updates becomes a performance bottleneck if XML documents are frequently updated. Unfortunately, no satisfactory work has been reported on the efficient update of XML data. In this paper, we first formalize the XML data update problem by defining the basic operators needed to support most XML update queries. We then present a new numbering scheme that not only requires minimal code length in comparison with existing numbering schemes but also improves update performance when XML data is frequently updated at arbitrary positions. The fundamental difference between our new scheme and existing ones is that, instead of maintaining explicit codes for elements, we store only the necessary information and generate the codes when they are needed in query processing. In addition to presenting the basic scheme, we also discuss some optimization techniques that further reduce the update cost. Results of a comprehensive performance study are provided to show the advantages of the new scheme.
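The region-based scheme that the abstract takes as its starting point can be illustrated with a minimal sketch. This is not the scheme proposed in the paper; it only shows the conventional (start, end) encoding and why an insertion forces renumbering. The names `Element`, `number`, and `is_ancestor`, and the tiny sample document, are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    tag: str
    children: list = field(default_factory=list)
    start: int = 0
    end: int = 0


def number(root):
    """Assign (start, end) codes in one depth-first pass over the tree."""
    counter = 0

    def visit(node):
        nonlocal counter
        counter += 1
        node.start = counter      # position where the element opens
        for child in node.children:
            visit(child)
        counter += 1
        node.end = counter        # position where the element closes

    visit(root)


def is_ancestor(a, d):
    """a is an ancestor of d iff a's region strictly contains d's region."""
    return a.start < d.start and d.end < a.end


# Hypothetical sample document: library > book > (title, author)
title = Element("title")
author = Element("author")
book = Element("book", [title, author])
root = Element("library", [book])
number(root)

print(is_ancestor(root, title), is_ancestor(title, author))

# Insert a new first child under book, then renumber and count how many
# of the previously existing elements received a different code.
existing = [root, book, title, author]
before = {id(e): (e.start, e.end) for e in existing}
book.children.insert(0, Element("isbn"))
number(root)
changed = sum(before[id(e)] != (e.start, e.end) for e in existing)
print(changed, "of", len(existing), "existing codes changed")
```

In this small example a single insertion changes the code of every element that already existed, which is the renumbering cost the abstract identifies as the bottleneck for frequently updated documents.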

XML, data updates, numbering scheme


Copyright information

© Springer Science + Business Media, Inc. 2005

Authors and Affiliations

  • Jeffrey Xu Yu (1)
  • Daofeng Luo (2)
  • Xiaofeng Meng (2)
  • Hongjun Lu (3)

  1. Chinese University of Hong Kong, Hong Kong, China
  2. Renmin University of China, Beijing, China
  3. Hong Kong University of Science and Technology, Hong Kong, China
