Incremental Validation of String-Based XML Data in Databases, File Systems, and Streams

  • Beda Christoph Hammerschmidt
  • Christian Werner
  • Ylva Brandt
  • Volker Linnemann
  • Sven Groppe
  • Stefan Fischer
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4690)

Abstract

Although native (tree-like) storage of XML data is becoming increasingly important, there will be an enduring demand to manage XML data in its textual representation, for instance in relational structures or file systems. XML data has to be well-formed by definition and, in many cases, it additionally has to be valid according to a given XML schema. Because XML column types are often derived from text types (e.g. CLOBs), guaranteeing well-formedness and validity is not trivial. Worse, for frequently modified data it is usually too expensive to re-validate the whole XML data after each update, but waiving re-validation may lead to inconsistencies and application malfunctions. In this paper we present a schema-aware pushdown automaton (i.e. a stack machine) that validates an XML string or stream. Using an element/state index, the pushdown automaton is able to re-validate local modifications of the data while guaranteeing overall validity. Update operations (e.g. SQLXML, XQuery updates) are validated before they are executed.
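
To make the idea of a schema-aware stack machine concrete, the following is a minimal illustrative sketch and not the automaton described in the paper. It assumes a hypothetical toy schema (`SCHEMA`, `ROOT`) in which each element's content model is a small finite automaton over child element names, and it assumes the input has already been tokenized into `("open", name)` and `("close", name)` events. Text content, attributes, and the element/state index used for incremental re-validation are left out.

```python
# Hypothetical toy schema (an assumption for illustration only).
# For each element: (start state, accepting states, transitions over child names).
SCHEMA = {
    "book": (0, {2}, {(0, "title"): 1, (1, "chapter"): 2, (2, "chapter"): 2}),
    "title": (0, {0}, {}),
    "chapter": (0, {0}, {}),
}
ROOT = "book"


def validate(events):
    """Return True if the event stream is well-formed and valid for SCHEMA."""
    stack = []  # each entry: [element name, current content-model state]
    seen_root = False
    for kind, name in events:
        if kind == "open":
            if name not in SCHEMA:
                return False  # element not declared in the schema
            if stack:
                # Advance the parent's content model on this child name.
                parent = stack[-1]
                transitions = SCHEMA[parent[0]][2]
                next_state = transitions.get((parent[1], name))
                if next_state is None:
                    return False  # child not allowed at this position
                parent[1] = next_state
            else:
                if seen_root or name != ROOT:
                    return False  # wrong root or a second root element
                seen_root = True
            stack.append([name, SCHEMA[name][0]])
        elif kind == "close":
            if not stack or stack[-1][0] != name:
                return False  # mismatched closing tag: not well-formed
            element, state = stack.pop()
            if state not in SCHEMA[element][1]:
                return False  # content model incomplete at the closing tag
        else:
            return False  # unknown event kind
    return seen_root and not stack


valid_doc = [
    ("open", "book"),
    ("open", "title"), ("close", "title"),
    ("open", "chapter"), ("close", "chapter"),
    ("close", "book"),
]
missing_chapter = [
    ("open", "book"),
    ("open", "title"), ("close", "title"),
    ("close", "book"),
]
print(validate(valid_doc))        # True
print(validate(missing_chapter))  # False
```

The stack checks well-formedness (matching open and close tags), while the per-element state checks validity against the schema. Because the state on top of the stack summarizes everything needed to continue, recording such states at element boundaries is one plausible way an index could let validation restart near a local modification instead of at the beginning of the document.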

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Beda Christoph Hammerschmidt (1)
  • Christian Werner (3)
  • Ylva Brandt (3)
  • Volker Linnemann (2)
  • Sven Groppe (2)
  • Stefan Fischer (3)

  1. Oracle Corporation, 400 Oracle Parkway, 4OP408, Redwood Shores, CA 94065, USA
  2. Institute of Information Systems, University of Luebeck, Germany
  3. Institute of Telematics, University of Luebeck, Germany
