Attribute grammars for scalable query processing on XML streams

  • Regular Paper
The VLDB Journal

Abstract

We introduce the notion of XML Stream Attribute Grammars (XSAGs). XSAGs are the first scalable query language for XML streams (running strictly in linear time with bounded memory consumption independent of the size of the stream) that allows for actual data transformations rather than just document filtering. XSAGs are also relatively easy to use for humans. Moreover, the XSAG formalism provides a strong intuition for which queries can or cannot be processed scalably on streams. We introduce XSAGs together with the necessary language-theoretic machinery, study their theoretical properties such as expressiveness and complexity, and discuss their implementation.

Author information

Corresponding author

Correspondence to Stefanie Scherzinger.

About this article

Cite this article

Koch, C., Scherzinger, S. Attribute grammars for scalable query processing on XML streams. The VLDB Journal 16, 317–342 (2007). https://doi.org/10.1007/s00778-005-0169-1

  • DOI: https://doi.org/10.1007/s00778-005-0169-1
