Efficient updates in dynamic XML data: from binary string to quaternary string

  • Regular Paper
  • The VLDB Journal

Abstract

XML query processing based on labeling schemes has been thoroughly studied in the past several years. Recently, efficient processing of updates in dynamic XML data has gained more attention. However, existing techniques all have high update cost, cannot completely avoid re-labeling in XML updates, and increase the label size, which affects query performance. In this paper we therefore propose a novel Compact Dynamic Binary String (CDBS) encoding to process updates efficiently. CDBS has two important properties that form the foundations of this paper: (1) a new CDBS code can be inserted between any two consecutive CDBS codes with the order preserved and without re-encoding the existing codes; (2) CDBS is orthogonal to specific labeling schemes, so it can be applied broadly to different labeling schemes or other applications to process updates efficiently. Moreover, because CDBS will encounter the overflow problem, we improve CDBS to the Compact Dynamic Quaternary String (CDQS) encoding, which completely avoids re-labeling in XML leaf node updates regardless of the labeling scheme. We also discuss how to process internal node updates efficiently. Experimental results show that CDBS and CDQS are superior to previous approaches in processing both leaf node and internal node updates.
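Property (1) can be illustrated with a minimal sketch. It is not taken from this page, which gives only the abstract; it assumes codes are binary strings that end in "1" and are compared lexicographically, and the function name `between` and the use of an empty string for a missing neighbor are hypothetical choices made for illustration.

```python
def between(left: str, right: str) -> str:
    """Return a binary string m with left < m < right in lexicographic order.

    Assumptions (illustrative, not confirmed by the abstract):
    - every code ends in "1";
    - an empty string means "no neighbor on that side".
    """
    if right == "":
        # No upper bound: extending left keeps it greater than left.
        return left + "1"
    if len(left) >= len(right):
        # left and right already differ at some earlier bit, so appending
        # "1" to left stays below right.
        return left + "1"
    # left is shorter: replace the final "1" of right with "01",
    # which sorts just below right and still above left.
    return right[:-1] + "01"


# Insert repeatedly into an ordered list without touching existing codes.
codes = ["1"]
codes.insert(0, between("", codes[0]))        # new first code
codes.insert(1, between(codes[0], codes[1]))  # between the two codes
codes.append(between(codes[-1], ""))          # new last code
print(codes)
```

Because Python compares strings lexicographically and "0" sorts before "1", `codes == sorted(codes)` holds after every insertion in this sketch, and no existing code is rewritten. The sketch does not model fixed-length storage of code sizes, which is where an overflow problem such as the one the abstract mentions would arise.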


Author information

Corresponding author

Correspondence to Changqing Li.

About this article

Cite this article

Li, C., Ling, T.W. & Hu, M. Efficient updates in dynamic XML data: from binary string to quaternary string. The VLDB Journal 17, 573–601 (2008). https://doi.org/10.1007/s00778-006-0021-2

  • DOI: https://doi.org/10.1007/s00778-006-0021-2
