Abstract
In recent years, very little effort has gone into giving XPath a proper algebraic treatment. One laudable exception is the Natix Algebra (NAL), which concisely defines the translation of XPath queries into algebraic expressions and thereby enables algebraic optimizations. However, NAL does not capture several promising core XML query evaluation algorithms, such as the Holistic Twig Join. By integrating a logical structural join operator, we enable NAL to be compiled into a physical algebra containing exactly those missing physical operators. We provide several important query unnesting rules and demonstrate the effectiveness of our approach with an implementation in the XML Transaction Coordinator (XTC), our prototype of a native XML database system.
Summary
In the literature of recent years, little importance has been attached to the algebraic treatment of the XML query language XPath. A laudable exception is the Natix Algebra (NAL), which precisely defines the translation of an XPath query into an algebraic expression and thus opens the door to algebraic optimization of this query language. On closer inspection, however, NAL fails to include well-known and promising evaluation algorithms, such as the Holistic Twig Join, in the translation process. The introduction of a logical structural join proposed in this article remedies this weakness and allows a NAL expression to be translated into a physical algebra that contains exactly these missing evaluation algorithms. In addition, important rules for unnesting XPath queries are introduced. Using the XML Transaction Coordinator (XTC), our prototype of a native XML database system, the expected efficiency gains are demonstrated.
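For readers unfamiliar with the term, the following is a minimal sketch of what a structural join computes: all (ancestor, descendant) pairs drawn from two sorted node lists. It is not the operator defined in the article and not XTC code. It assumes that each node carries a hypothetical region label `(start, end)` assigned by a document-order traversal, so that one node contains another exactly when its interval encloses the other's. The names `Node` and `structural_join` are illustrative only, and the stack-based merge follows the general idea of the structural join algorithms cited in the references.

```python
from typing import List, NamedTuple, Tuple


class Node(NamedTuple):
    """Hypothetical region label: a node spans [start, end] in document order."""
    start: int
    end: int
    name: str


def structural_join(ancestors: List[Node],
                    descendants: List[Node]) -> List[Tuple[Node, Node]]:
    """Return all (a, d) pairs where a is an ancestor of d.

    Both inputs must be sorted by `start`. Regions are assumed to come
    from one tree, so any two regions are either nested or disjoint.
    """
    result: List[Tuple[Node, Node]] = []
    stack: List[Node] = []  # chain of nested ancestor candidates
    i = 0
    for d in descendants:
        # Push every ancestor candidate that starts before d.
        while i < len(ancestors) and ancestors[i].start < d.start:
            a = ancestors[i]
            # Candidates that end before a starts cannot contain a or d.
            while stack and stack[-1].end < a.start:
                stack.pop()
            stack.append(a)
            i += 1
        # Drop candidates that end before d starts.
        while stack and stack[-1].end < d.start:
            stack.pop()
        # Every node left on the stack encloses d.
        for a in stack:
            result.append((a, d))
    return result


# Assumed toy document: <a><a><c/></a><c/></a>
outer_a = Node(1, 8, "a")
inner_a = Node(2, 5, "a")
c1 = Node(3, 4, "c")
c2 = Node(6, 7, "c")

# Corresponds roughly to the pattern //a//c on the toy document.
pairs = structural_join([outer_a, inner_a], [c1, c2])
```

Each input list is scanned once, which is the property that makes this family of joins attractive as a physical operator. The sketch returns pairs ordered by descendant.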
References
Al-Khalifa S, Jagadish HV, Patel JM, Wu Y, Koudas N, Srivastava D (2002) Structural Joins: A Primitive for Efficient XML Query Pattern Matching. Proc ICDE: 141–152
Böhme T, Rahm E (2004) Supporting Efficient Streaming and Insertion of XML Data in RDBMS. Proc 3rd DIWeb Workshop: 70–81
Boncz PA, Grust T, van Keulen M, Manegold S, Rittinger J, Teubner J (2006) MonetDB/XQuery: a fast XQuery processor powered by a relational engine. Proc SIGMOD: 479–490
Brantner M, Kanne CC, Helmer S, Moerkotte G (2005) Full-fledged Algebraic XPath Processing in Natix. Proc ICDE: 705–716
Brantner M, Kanne CC, Helmer S, Moerkotte G (2006) Algebraic Optimization of Nested XPath Expressions. Proc ICDE: 128
Bruno N, Koudas N, Srivastava D (2002) Holistic twig joins: Optimal XML pattern matching. Proc SIGMOD: 310–321
Chen Q, Lim A, Ong KW (2003) D(k)-Index: An Adaptive Structural Summary for Graph-Structured Data. Proc SIGMOD: 134–144
Chien SY, Vagena Z, Zhang D, Tsotras VJ, Zaniolo C (2002) Efficient Structural Joins on Indexed XML Documents. Proc VLDB: 263–274
W3C Recommendation: XQuery 1.0 and XPath 2.0 Formal Semantics. W3C Specification. http://www.w3.org/TR/xquery-semantics/
Fernández MF, Hidders J, Michiels P, Siméon J, Vercammen R (2005) Optimizing Sorting and Duplicate Elimination. Proc DEXA: 554–563
Fontoura M, Josifovski V, Shekita E, Yang B (2005) Optimizing Cursor Movement in Holistic Twig Joins. Proc 14th CIKM: 784–791
Grust T, van Keulen M, Teubner J (2003) Staircase Join: Teach a Relational DBMS to Watch its Axis Steps. Proc VLDB: 524–535
Härder T, Haustein M, Mathis C, Wagner M (2007) Node Labeling Schemes for Dynamic XML Documents Reconsidered. Data Knowl Eng 60:126–149
Hidders J, Michiels P, Siméon J, Vercammen R (2006) How to recognize different kinds of tree patterns from quite a long way away. Technical Report TR UA 13-2006, Univ. of Antwerp and IBM Research
Li HG, Aghili SA, El Abbadi A (2006) FLUX: Content and Structure Matching of XPath Queries with Range Predicates. Proc XSym: 61–76
Jagadish H, Lakshmanan L, Srivastava D, Thompson K (2001) TAX: A Tree Algebra for XML. Proc DBPL: 149–164
Paparizos S, Wu Y, Lakshmanan LVS, Jagadish HV (2004) Tree Logical Classes for Efficient Evaluation of XQuery. Proc SIGMOD: 71–82
Mathis C, Härder T, Haustein M (2006) Locking-Aware Structural Join Operators for XML Query Processing. Proc SIGMOD: 467–478
Mathis C, Härder T (2006) Hash-Based Structural Join Algorithms. Proc DATAX’06, LNCS 4254, Springer-Verlag, 136–149
Mathis C (2007) Integrating Structural Joins into a Tuple-Based XPath Algebra. Proc BTW: 242–261
May N, Helmer S, Moerkotte G (2004) Nested Queries and Quantifiers in an Ordered Context. Proc ICDE: 239–250
May N, Brantner M, Böhm A, Kanne CC, Moerkotte G (2006) Index vs. Navigation in XPath Evaluation. Proc XSym: 16–30
Michiels P, Mihaila GA, Siméon J (2007) Put a tree pattern in your algebra. Proc ICDE
O’Neil PE, O’Neil EJ, Pal S, Cseri I, Schaller G, Westbury N (2004) ORDPATHs: Insert-friendly XML node labels. Proc SIGMOD: 903–908
Re C, Siméon J, Fernández M (2006) A Complete and Efficient Algebraic Compiler for XQuery. Proc ICDE: 14
Schmidt A, Waas F, Kersten M, Carey MJ, Manolescu I, Busse R (2002) XMark: A Benchmark for XML Data Management. Proc VLDB: 974–985
Wu Y, Patel JM, Jagadish HV (2003) Structural Join Order Selection for XML Query Optimization. Proc ICDE: 443–454
W3C Recommendation: XML Path Language (XPath), Version 1.0 (1999). http://www.w3.org/TR/xpath
Vagena Z, Koudas N, Srivastava D, Tsotras VJ (2005) Efficient Handling of Positional Predicates within XML Query Processing. Proc XSym: 68–83
Additional information
This work has been supported by the Rheinland-Pfalz cluster of excellence "Dependable Adaptive Systems and Mathematical Modeling" (see http://www.dasmod.de). The article at hand is an extended version of [20].
CR subject classification
H.2.4; D.3.4
About this article
Cite this article
Mathis, C. Extending a tuple-based XPath algebra to enhance evaluation flexibility. Informatik Forsch. Entw. 21, 147–164 (2007). https://doi.org/10.1007/s00450-007-0023-3
DOI: https://doi.org/10.1007/s00450-007-0023-3