Abstract
Algorithms for processing structural joins are essential building blocks for XML query evaluation. Their design is difficult because they have to satisfy many requirements, e.g., guarantee linear worst-case runtime; generate sorted, duplicate-free output; adapt to widely varying input sizes and element distributions; enable pipelining; and probably more. Therefore, it is not possible to design a single structural join algorithm that fits all cases. Rather, providing different specialized operators, from which the query optimizer can choose, benefits query efficiency. We propose new hash-based structural joins that can process unordered input sequences possibly containing duplicates. We also show that these algorithms can substantially reduce the number of sort operations on intermediate results for complex tree-structured queries (twigs).
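To make the idea of a hash-based structural join concrete, the following is a minimal sketch and not the algorithms proposed in the paper. It assumes prefix-based (Dewey-style) node labels, represented here as Python tuples, so that every proper prefix of a label identifies an ancestor. The function name `hash_structural_join` and the sample labels are hypothetical. The sketch accepts unordered inputs with duplicates and emits each ancestor/descendant pair once, in the arrival order of the descendants rather than in document order.

```python
from typing import Iterable, Iterator, Tuple

# Assumption: a Dewey-style label is the path of child positions from the root.
Label = Tuple[int, ...]


def hash_structural_join(
    ancestors: Iterable[Label], descendants: Iterable[Label]
) -> Iterator[Tuple[Label, Label]]:
    """Yield each (ancestor, descendant) pair exactly once."""
    # Build phase: hashing the ancestor candidates removes duplicates
    # and does not require any particular input order.
    build = set(ancestors)
    seen = set()  # descendant labels that were already probed

    # Probe phase: look up every proper prefix of a descendant label.
    for d in descendants:
        if d in seen:
            continue  # skip duplicate descendants
        seen.add(d)
        for depth in range(1, len(d)):
            a = d[:depth]
            if a in build:
                yield (a, d)


# Unordered inputs that contain duplicates (illustrative labels only).
a_nodes = [(1, 2), (1,), (1, 2)]
d_nodes = [(1, 2, 5), (1, 3), (1, 2, 5), (1, 2, 1)]
pairs = list(hash_structural_join(a_nodes, d_nodes))
```

Because the function is a generator, results can be consumed in a pipelined fashion during the probe phase. If a later operator needs document order, a sort on the output would still be required in this sketch.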
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Mathis, C., Härder, T. (2006). Hash-Based Structural Join Algorithms. In: Grust, T., et al. Current Trends in Database Technology – EDBT 2006. EDBT 2006. Lecture Notes in Computer Science, vol 4254. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11896548_14
DOI: https://doi.org/10.1007/11896548_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-46788-5
Online ISBN: 978-3-540-46790-8
eBook Packages: Computer Science, Computer Science (R0)