Hash-Based Structural Join Algorithms

  • Christian Mathis
  • Theo Härder
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4254)


Algorithms for processing structural joins are essential building blocks for XML query evaluation. Their design is difficult because they have to satisfy many requirements, e.g., guarantee linear worst-case runtime; generate sorted, duplicate-free output; adapt to widely varying input sizes and element distributions; enable pipelining; and (probably) more. Therefore, it is not possible to design a single structural join algorithm that is best in all situations. Rather, providing different specialized operators, from which the query optimizer can choose, benefits query efficiency. We propose new hash-based structural joins that can process unordered input sequences, possibly containing duplicates. We also show that these algorithms can substantially reduce the number of sort operations on intermediate results for (complex) tree-structured queries (twigs).
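
To illustrate the general idea of a hash-based structural join over unordered input with duplicates, here is a minimal sketch. It is not the algorithm from the paper. It assumes Dewey-style prefix labels, modeled as tuples of integers, so that a node is an ancestor of another exactly when its label is a proper prefix of the other's. The function name `hash_structural_join` and the `Label` type are hypothetical.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical Dewey-style label: (1, 3, 2) is the 2nd child of the
# 3rd child of the root (1,). This labeling is an assumption.
Label = Tuple[int, ...]


def hash_structural_join(
    ancestors: Iterable[Label],
    descendants: Iterable[Label],
) -> List[Tuple[Label, Label]]:
    """Return distinct (ancestor, descendant) pairs.

    Neither input needs to be sorted, and both may contain duplicates.
    """
    # Build phase: hashing the ancestor labels also removes duplicates.
    anc_table: Set[Label] = set(ancestors)

    seen: Set[Tuple[Label, Label]] = set()
    result: List[Tuple[Label, Label]] = []

    # Probe phase: every proper prefix of a descendant label is the
    # label of one of its ancestors, so probe the table once per prefix.
    for d in descendants:
        for k in range(1, len(d)):
            prefix = d[:k]
            if prefix in anc_table:
                pair = (prefix, d)
                if pair not in seen:  # suppress duplicate output pairs
                    seen.add(pair)
                    result.append(pair)
    return result


if __name__ == "__main__":
    anc = [(1, 3), (1,), (1, 3)]                 # unordered, with a duplicate
    desc = [(1, 3, 2), (1, 5), (1, 3, 2), (2, 1)]
    for pair in hash_structural_join(anc, desc):
        print(pair)
```

In this sketch the output is duplicate-free but follows probe order, not document order, so an operator that needs sorted output would still have to sort it. Each descendant costs one hash probe per level of its depth.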


Input Sequence, Hash Table, Query Evaluation, Query Optimizer, XPath Expression
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Christian Mathis (1)
  • Theo Härder (1)
  1. Database and Information Systems, University of Kaiserslautern, Kaiserslautern, Germany
