Abstract
Querying XML has been the subject of much recent investigation. A formal bulk algebra is essential for applying database-style optimization to XML queries. We develop such an algebra, called TAX (Tree Algebra for XML), for manipulating XML data, modeled as forests of labeled ordered trees. Motivated both by aesthetic considerations of intuitiveness and by efficient computability and amenability to optimization, we develop TAX as a natural extension of relational algebra, with a small set of operators. TAX is complete for relational algebra extended with aggregation, and can express most queries expressible in popular XML query languages. It forms the basis for the Timber XML database system, which we are currently developing.
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jagadish, H.V., Lakshmanan, L.V.S., Srivastava, D., Thompson, K. (2002). TAX: A Tree Algebra for XML. In: Ghelli, G., Grahne, G. (eds) Database Programming Languages. DBPL 2001. Lecture Notes in Computer Science, vol 2397. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46093-4_9
DOI: https://doi.org/10.1007/3-540-46093-4_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44080-2
Online ISBN: 978-3-540-46093-0
eBook Packages: Springer Book Archive