Efficient Evaluation of Generalized Tree-Pattern Queries with Same-Path Constraints

  • Xiaoying Wu
  • Dimitri Theodoratos
  • Stefanos Souldatos
  • Theodore Dalamagas
  • Timos Sellis
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5566)

Abstract

Querying XML data is based on the specification of structural patterns, which in practice are formulated using XPath. Usually, these structural patterns take the form of trees (Tree-Pattern Queries, TPQs). Requirements for flexible querying of XML data, including XML data from scientific applications, have recently motivated the introduction of query languages that are more general and flexible than TPQs. These query languages correspond to a fragment of XPath larger than TPQs, for which efficient non-main-memory evaluation algorithms are not known.

In this paper, we consider a query language, called the Partial Tree-Pattern Query (PTPQ) language, which generalizes and strictly contains TPQs. PTPQs represent a broad fragment of XPath that is very useful in practice. We show how PTPQs can be represented as directed acyclic graphs augmented with “same-path” constraints. We develop an original polynomial-time holistic algorithm for PTPQs under the inverted list evaluation model. To the best of our knowledge, this is the first algorithm to support the evaluation of such a broad structural fragment of XPath. We provide a theoretical analysis of our algorithm and identify cases where it is asymptotically optimal. In order to assess its performance, we design two other techniques that evaluate PTPQs by exploiting existing state-of-the-art algorithms for smaller classes of queries. An extensive experimental evaluation shows that our holistic algorithm outperforms both of them.
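
The representation of a query as a directed acyclic graph with “same-path” constraints can be illustrated with a minimal Python sketch. It is not the algorithm of the paper. The class names `Region` and `PatternQuery`, the helper functions, and the (start, end) interval labeling of XML elements are all assumptions introduced here for illustration; the abstract does not specify any of them. The sketch only checks that a query graph is acyclic and that a given assignment of XML elements to query nodes places each constrained group of nodes on a single root-to-leaf path, which in a tree holds exactly when the elements are pairwise related by the ancestor-or-self relation.

```python
from dataclasses import dataclass, field
from itertools import combinations


@dataclass(frozen=True)
class Region:
    """Hypothetical (start, end) interval label of an XML element."""
    start: int
    end: int

    def contains(self, other: "Region") -> bool:
        # Ancestor-or-self test under the assumed interval labeling.
        return self.start <= other.start and other.end <= self.end


@dataclass
class PatternQuery:
    """Illustrative query: a DAG plus same-path constraints."""
    nodes: set = field(default_factory=set)
    # Each edge is (source, target, axis), axis in {"child", "descendant"}.
    edges: set = field(default_factory=set)
    # Each constraint is a set of query nodes that must share one path.
    same_path: list = field(default_factory=list)

    def is_acyclic(self) -> bool:
        # Kahn's algorithm: repeatedly remove nodes with no incoming edges.
        indegree = {n: 0 for n in self.nodes}
        for _, target, _ in self.edges:
            indegree[target] += 1
        ready = [n for n, d in indegree.items() if d == 0]
        removed = 0
        while ready:
            current = ready.pop()
            removed += 1
            for source, target, _ in self.edges:
                if source == current:
                    indegree[target] -= 1
                    if indegree[target] == 0:
                        ready.append(target)
        return removed == len(self.nodes)


def on_same_path(regions) -> bool:
    # Elements lie on one root-to-leaf path iff every pair is nested.
    return all(a.contains(b) or b.contains(a)
               for a, b in combinations(regions, 2))


def satisfies_same_path(query: PatternQuery, embedding: dict) -> bool:
    # embedding maps each query node to the Region of an XML element.
    return all(on_same_path([embedding[q] for q in group])
               for group in query.same_path)
```

For example, a query with nodes `a`, `b`, `c`, descendant edges from `a` to `b` and from `a` to `c`, and the constraint `{b, c}` accepts an assignment where the element for `c` is nested inside the element for `b`, and rejects one where the two elements are siblings.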

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Xiaoying Wu (1)
  • Dimitri Theodoratos (1)
  • Stefanos Souldatos (2)
  • Theodore Dalamagas (3)
  • Timos Sellis (2, 3)

  1. New Jersey Institute of Technology, USA
  2. National Technical University of Athens, Greece
  3. Institute for the Management of Information Systems, Greece
