TAX: A Tree Algebra for XML

  • H. V. Jagadish
  • Laks V. S. Lakshmanan
  • Divesh Srivastava
  • Keith Thompson
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2397)


Querying XML has been the subject of much recent investigation. A formal bulk algebra is essential for applying database-style optimization to XML queries. We develop such an algebra, called TAX (Tree Algebra for XML), for manipulating XML data modeled as forests of labeled ordered trees. Motivated both by the aesthetic consideration of intuitiveness and by efficient computability and amenability to optimization, we develop TAX as a natural extension of relational algebra with a small set of operators. TAX is complete for relational algebra extended with aggregation, and it can express most queries expressible in popular XML query languages. It forms the basis for the Timber XML database system, which we are currently developing.
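To make the data model in the abstract concrete, the following is a minimal sketch of a forest of labeled ordered trees together with a selection-style operator that maps a collection of trees to a collection of trees, much as relational selection maps relations to relations. It is an illustration only: the names `Node`, `descendants`, and `select` are hypothetical, and the predicate-based selection shown here is a simplification, not the operator definitions given in the paper.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List


@dataclass
class Node:
    """A labeled node; the order of `children` is significant (ordered tree)."""
    label: str
    value: str = ""
    children: List["Node"] = field(default_factory=list)


def descendants(node: Node) -> Iterator[Node]:
    """Yield the node itself and all of its descendants in pre-order."""
    yield node
    for child in node.children:
        yield from descendants(child)


def select(forest: List[Node], predicate: Callable[[Node], bool]) -> List[Node]:
    """Hypothetical selection: keep each tree that has a node satisfying
    `predicate`. Input and output are both forests, so calls compose."""
    return [tree for tree in forest
            if any(predicate(n) for n in descendants(tree))]


# A two-tree forest with made-up data (assumed purely for illustration).
forest = [
    Node("book", children=[Node("title", "Alpha"), Node("year", "2002")]),
    Node("book", children=[Node("title", "Beta"), Node("year", "1996")]),
]

# Select books published in 2000 or later; the result is again a forest.
recent = select(forest, lambda n: n.label == "year" and int(n.value) >= 2000)
print([tree.children[0].value for tree in recent])
```

Because `select` returns the same kind of collection it consumes, its output can be passed to another operator, which is the closure property that makes algebraic rewriting and optimization possible in the relational setting.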


Keywords: Data Tree, Pattern Tree, Query Language, Relational Algebra, Input Tree
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • H. V. Jagadish (1)
  • Laks V. S. Lakshmanan (2)
  • Divesh Srivastava (3)
  • Keith Thompson (1)

  1. University of Michigan, Ann Arbor, USA
  2. University of British Columbia, Vancouver, Canada
  3. AT&T Labs-Research, Florham Park, USA
