Abstract
We introduce the notion of XML Stream Attribute Grammars (XSAGs). XSAGs are the first scalable query language for XML streams (running strictly in linear time, with bounded memory consumption independent of the size of the stream) that allows for actual data transformations rather than just document filtering. XSAGs are also relatively easy for humans to use. Moreover, the XSAG formalism provides a strong intuition for which queries can and which cannot be processed scalably on streams. We introduce XSAGs together with the necessary language-theoretic machinery, study their theoretical properties such as expressiveness and complexity, and discuss their implementation.
Cite this article
Koch, C., Scherzinger, S. Attribute grammars for scalable query processing on XML streams. The VLDB Journal 16, 317–342 (2007). https://doi.org/10.1007/s00778-005-0169-1