The VLDB Journal, Volume 21, Issue 6, pp 889–914

ANDES: efficient evaluation of NOT-twig queries in relational databases

  • Kheng Hong Soh
  • Ba Quan Truong
  • Sourav S. Bhowmick
Regular Paper


Despite a large body of work on XPath query processing in relational environments, the systematic study of queries containing not-predicates has received little attention in the literature. In particular, the XML support in several industrial-strength commercial RDBMSs fails to evaluate such queries efficiently. In this paper, we present a novel and efficient strategy for evaluating NOT-twig queries in a tree-unaware relational environment. NOT-twig queries are XPath queries with ancestor–descendant and parent–child axes that contain one or more not-predicates. We propose a novel Dewey-based encoding scheme called ANDES (ANcestor Dewey-based Encoding Scheme), which enables us to efficiently filter out elements satisfying a not-predicate by comparing their ancestor group identifiers. In this approach, a set of elements under the same common ancestor at a specific level in the XML tree is assigned the same ancestor group identifier. Based on this scheme, we propose a novel SQL translation algorithm for NOT-twig query evaluation. Our experiments confirm that the proposed approach, built on top of an off-the-shelf commercial RDBMS, significantly outperforms state-of-the-art relational and native approaches. We also explore the query plans selected by a commercial relational optimizer to evaluate our translated queries under different input cardinalities. This exploration further validates the performance benefits of ANDES.
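The ancestor-group idea can be illustrated with a minimal Python sketch. This is not the paper's encoding or its SQL translation algorithm. It assumes plain Dewey labels stored as integer tuples and treats the label prefix at a given level as the ancestor group identifier. The function names `ancestor_group_id` and `filter_not_descendant`, and the small sample tree, are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

# Assumption: a Dewey label is a tuple of child positions from the root,
# e.g. (1, 3, 2) is the 2nd child of the 3rd child of the root.
Dewey = Tuple[int, ...]


def ancestor_group_id(label: Dewey, level: int) -> Dewey:
    """Return the label prefix naming the ancestor at `level` (root is level 1)."""
    return label[:level]


def filter_not_descendant(
    candidates: Sequence[Dewey], excluded: Sequence[Dewey]
) -> List[Dewey]:
    """Keep candidates that have no descendant among `excluded`.

    Mirrors a query shaped like //a[not(.//b)]: `candidates` are the
    a-elements and `excluded` are the b-elements.
    """
    levels = {len(c) for c in candidates}
    # Every excluded element "blocks" its ancestors at the candidate levels.
    blocked = {
        ancestor_group_id(e, lvl)
        for e in excluded
        for lvl in levels
        if len(e) > lvl
    }
    # Anti-join: a candidate survives only if no excluded element shares its group.
    return [c for c in candidates if c not in blocked]


# Hypothetical tree: three item elements under the root, two of which
# contain a payment element somewhere below them.
items = [(1, 1), (1, 2), (1, 3)]
payments = [(1, 1, 2), (1, 3, 1, 1)]
print(filter_not_descendant(items, payments))  # [(1, 2)]
```

In a relational setting the same comparison would typically be expressed as an anti-join on a stored group identifier column rather than computed in application code.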


Keywords: NOT-twig, XPath, XML query processing, Relational database, Performance, Twig evaluation tree, Plan diagrams





Copyright information

© Springer-Verlag 2012

Authors and Affiliations

  • Kheng Hong Soh (1)
  • Ba Quan Truong (1)
  • Sourav S. Bhowmick (1)
  1. School of Computer Engineering, Nanyang Technological University, Singapore
