Efficient Incremental Validation of XML Documents After Composite Updates

  • Denilson Barbosa
  • Gregory Leighton
  • Andrew Smith
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4156)


We describe an efficient method for the incremental validation of XML documents after composite updates. We introduce the class of Bounded-Edit (BE) DTDs and XML Schemas, and give a simple incremental revalidation algorithm that yields optimal performance for them, in the sense that its time complexity is linear in the number of operations in the update. We give extensive experimental results showing that our algorithm exhibits excellent scalability. Finally, we provide a statistical analysis of over 250 DTDs and XML Schema specifications found on the Web, showing that over 99% of them are in fact in BE.


Keywords: Regular Expression; Content Model; Incremental Validation; Document Schema; Atomic Operation




References

  1. Abiteboul, S., Buneman, P., Suciu, D.: Data on the Web. Morgan Kaufmann, San Francisco (1999)
  2. Baeza-Yates, R., Ribeiro-Neto, B.: Modern Information Retrieval. Addison Wesley, Reading (1999)
  3. Balmin, A., Papakonstantinou, Y., Vianu, V.: Incremental Validation of XML Documents. ACM Transactions on Database Systems 29(4), 710–751 (2004). Extended version of [16]
  4. Barbosa, D., Mendelzon, A.O., Libkin, L., Mignet, L., Arenas, M.: Efficient Incremental Validation of XML Documents. In: Proceedings of the 20th International Conference on Data Engineering, Boston, MA, USA, pp. 671–682. IEEE Computer Society, Los Alamitos (2004)
  5. Bex, G.J., Neven, F., Van den Bussche, J.: DTDs versus XML Schema: A Practical Study. In: Proceedings of the Seventh International Workshop on the Web and Databases, WebDB 2004, Maison de la Chimie, Paris, France, June 17-18, pp. 79–84 (2004)
  6. Bouchou, B., Halfeld-Ferrari-Alves, M.: Updates and incremental validation of XML documents. In: Lausen, G., Suciu, D. (eds.) DBPL 2003. LNCS, vol. 2921, pp. 216–232. Springer, Heidelberg (2004)
  7. Brauer, M., Durusau, P., Edwards, G., Faure, D., Magliery, T., Vogelheim, D.: Open Document Format for Office Applications (OpenDocument) v1.0. OASIS standard, Organization for the Advancement of Structured Information Standards (OASIS) (May 1, 2005)
  8. Bray, T., Paoli, J., Sperberg-McQueen, C.M., Maler, E., Yergeau, F.: Extensible Markup Language (XML) 1.0. World Wide Web Consortium, 3rd edn., February 4 (2004), http://www.w3.org/TR/2004/REC-xml-20040204
  9. Brüggemann-Klein, A., Wood, D.: One-Unambiguous Regular Languages. Information and Computation 142, 182–206 (1998)
  10. Chamberlin, D., Florescu, D., Robie, J.: XQuery Update Facility. W3C Working Draft (May 8, 2006)
  11. Kane, B., Su, H., Rundensteiner, E.A.: Consistently Updating XML Documents Using Incremental Constraint Check Queries. In: Fourth ACM CIKM International Workshop on Web Information and Data Management, McLean, Virginia, USA, November 8, pp. 1–8 (2002)
  12. Libkin, L., Wong, L.: On the Power of Incremental Evaluation in SQL-Like Languages. In: Connor, R.C.H., Mendelzon, A.O. (eds.) DBPL 1999. LNCS, vol. 1949, pp. 17–30. Springer, Heidelberg (2000)
  13. Office 2003 XML Reference Schema (2006), http://www.microsoft.com/office/xml
  14. National Library of Medicine (2004), http://www.nlm.nih.gov/
  15. Papakonstantinou, Y., Vianu, V.: DTD Inference for Views of XML Data. In: Proceedings of the 19th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, May 15-18, pp. 35–46 (2000)
  16. Papakonstantinou, Y., Vianu, V.: Incremental Validation of XML Documents. In: Calvanese, D., Lenzerini, M., Motwani, R. (eds.) ICDT 2003. LNCS, vol. 2572, pp. 47–63. Springer, Heidelberg (2002)
  17. Patnaik, S., Immerman, N.: Dyn-FO: A Parallel, Dynamic Complexity Class. J. Comput. Syst. Sci. 55(2), 199–209 (1997)
  18. PIR Non-Redundant Reference Sequence Database (PIR-NREF) (October 14, 2004), http://pir.georgetown.edu/pirwww/search/pirnref.shtml
  19. Segoufin, L.: Typing and Querying XML Documents: Some Complexity Bounds. In: Proceedings of the 22nd ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, San Diego, CA, USA, June 9-11, pp. 167–178 (2003)
  20. Thompson, H.S., Beech, D., Maloney, M., Mendelsohn, N. (eds.): XML Schema Part 1: Structures. World Wide Web Consortium, May 2 (2001), http://www.w3.org/TR/2001/REC-xmlschema-1-20010502/
  21. The XML benchmark project, http://www.xml-benchmark.org/
  22. Yu, S.: Regular Languages. In: Rozenberg, G., Salomaa, A. (eds.) Handbook of Formal Languages, vol. 1, pp. 41–110. Springer, Heidelberg (1997)

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Denilson Barbosa (1)
  • Gregory Leighton (1)
  • Andrew Smith (1)

  1. University of Calgary, Calgary, Canada