
The VLDB Journal, Volume 21, Issue 6, pp 889–914

ANDES: efficient evaluation of NOT-twig queries in relational databases

  • Kheng Hong Soh
  • Ba Quan Truong
  • Sourav S. Bhowmick
Regular Paper

Abstract

Despite a large body of work on XPath query processing in relational environments, the systematic study of queries containing not-predicates has received little attention in the literature. In particular, the XML support in several industrial-strength commercial RDBMSs fails to evaluate such queries efficiently. In this paper, we present an efficient and novel strategy to evaluate NOT-twig queries in a tree-unaware relational environment. NOT-twig queries are XPath queries with ancestor–descendant and parent–child axes that contain one or more not-predicates. We propose a novel Dewey-based encoding scheme called ANDES (ANcestor Dewey-based Encoding Scheme), which enables us to efficiently filter out elements satisfying a not-predicate by comparing their ancestor group identifiers. In this approach, a set of elements under the same common ancestor at a specific level in the XML tree is assigned the same ancestor group identifier. Based on this scheme, we propose a novel SQL translation algorithm for NOT-twig query evaluation. Our experiments confirm that the proposed approach, built on top of an off-the-shelf commercial RDBMS, significantly outperforms state-of-the-art relational and native approaches. We also explore the query plans selected by a commercial relational optimizer to evaluate our translated queries under different input cardinalities. This exploration further validates the performance benefits of ANDES.
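The idea of filtering by ancestor group identifiers can be illustrated with a minimal Python sketch. It is not the encoding or the SQL translation defined in the paper: it assumes, purely for illustration, that an element's ancestor group identifier at a given level is the prefix of its Dewey label truncated to that level, and that every level-2 element in the toy document is a `book`. The names `ancestor_group_id` and `not_filter` and the sample labels are hypothetical.

```python
from typing import Dict, List, Tuple

# A Dewey label is the path of sibling positions from the root, e.g. (1, 2, 1).
Dewey = Tuple[int, ...]


def ancestor_group_id(label: Dewey, level: int) -> Dewey:
    """Return the Dewey prefix identifying the ancestor at `level` (1-based).

    Assumption for this sketch: elements that share an ancestor at `level`
    share this prefix, so the prefix can serve as their group identifier.
    """
    if level < 1 or level > len(label):
        raise ValueError("level must be between 1 and the depth of the label")
    return label[:level]


def not_filter(candidates: List[Dewey], excluded: List[Dewey], level: int) -> List[Dewey]:
    """Keep candidates whose ancestor group at `level` contains no excluded element."""
    blocked = {
        ancestor_group_id(label, level)
        for label in excluded
        if len(label) >= level
    }
    return [c for c in candidates if ancestor_group_id(c, level) not in blocked]


# Toy document (hypothetical): three books under the root; books 1.1 and 1.3
# contain a review somewhere below them, book 1.2 does not.
elements: Dict[str, List[Dewey]] = {
    "book": [(1, 1), (1, 2), (1, 3)],
    "title": [(1, 1, 1), (1, 2, 1), (1, 3, 1)],
    "review": [(1, 1, 2, 1), (1, 3, 2)],
}

# Roughly //book[not(.//review)]/title, with book elements at level 2.
result = not_filter(elements["title"], elements["review"], level=2)
print(result)  # [(1, 2, 1)]
```

In a relational setting the same comparison would be expressed over stored label columns rather than in-memory tuples. The sketch only shows why grouping elements by a shared ancestor turns the not-predicate into a set-membership test on group identifiers.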

Keywords

NOT-twig, XPath, XML query processing, Relational database, Performance, Twig evaluation tree, Plan diagrams



Copyright information

© Springer-Verlag 2012

Authors and Affiliations

  • Kheng Hong Soh
    • 1
  • Ba Quan Truong
    • 1
  • Sourav S. Bhowmick
    • 1
  1. School of Computer Engineering, Nanyang Technological University, Singapore, Singapore
