TwigStackPrime: A Novel Twig Join Algorithm Based on Prime Numbers

  • Shtwai Alsubai
  • Siobhán North
Conference paper
Part of the Lecture Notes in Business Information Processing book series (LNBIP, volume 322)

Abstract

The growing number of XML documents creates a need for XML querying algorithms that can exploit the specific characteristics of XML documents. A labelling scheme is fundamental to processing XML queries efficiently: labels are used to determine the structural relationships between the elements that correspond to query nodes in twig pattern queries (TPQs). This article presents the design and implementation of a new indexing technique that exploits a property of prime numbers to identify Parent-Child (P-C) relationships in TPQs during query evaluation. The Child Prime Label (CPL, for short) approach can be incorporated efficiently into existing labelling schemes. We propose a novel twig matching algorithm based on the well-known TwigStack algorithm [3]; it applies the CPL approach and focuses on reducing the overhead of storing useless elements and performing unnecessary join operations. Our performance evaluation demonstrates that the new algorithm significantly outperforms the previous approaches.
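
The abstract does not spell out how the CPL is computed, so the following is only a minimal sketch of one way a prime-number child label could answer a P-C test. It assumes (this is not stated above) that every distinct tag name is given its own prime and that an element's label is the product of the primes of its direct children's tags, so that divisibility by a tag's prime signals a child with that tag. The names `assign_tag_primes`, `child_prime_label` and `has_child_with_tag` are hypothetical and are not taken from the paper.

```python
import xml.etree.ElementTree as ET
from itertools import count


def prime_stream():
    """Yield 2, 3, 5, ... by trial division (adequate for a small tag vocabulary)."""
    found = []
    for n in count(2):
        if all(n % p for p in found):
            found.append(n)
            yield n


def assign_tag_primes(root):
    """Assumption: each distinct tag name gets its own prime, in document order."""
    primes = prime_stream()
    tag_prime = {}
    for elem in root.iter():
        if elem.tag not in tag_prime:
            tag_prime[elem.tag] = next(primes)
    return tag_prime


def child_prime_label(elem, tag_prime):
    """Assumption: the label is the product of the primes of the distinct child tags."""
    label = 1
    for tag in {child.tag for child in elem}:
        label *= tag_prime[tag]
    return label


def has_child_with_tag(label, tag, tag_prime):
    """P-C test: the label is divisible by the tag's prime iff such a child exists."""
    p = tag_prime.get(tag)
    return p is not None and label % p == 0


# Tiny document: <c> is a grandchild of <a>, not a child.
root = ET.fromstring("<a><b><c/></b><d/></a>")
tag_prime = assign_tag_primes(root)          # a=2, b=3, c=5, d=7
label_a = child_prime_label(root, tag_prime)

print(label_a)                                      # 21, i.e. 3 * 7
print(has_child_with_tag(label_a, "b", tag_prime))  # True
print(has_child_with_tag(label_a, "c", tag_prime))  # False
```

Under these assumptions, the test for a child with a given tag is a single modulo operation on the candidate parent's label, which is the kind of check that could let a twig join discard an element before it is stored or joined. The product grows with the number of distinct child tags, so a practical implementation would need to bound or split the label; the sketch ignores that.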

Keywords

XML databases · Holistic twig join algorithm · Node labelling · Twig pattern query

References

  1. Aghili, S.A., Li, H.-G., Agrawal, D., El Abbadi, A.: TWIX: twig structure and content matching of selective queries using. In: Proceedings of the 1st International Conference on InfoScale 2006, p. 42 (2006)
  2. Alsubai, S., North, S.: A prime number approach to matching an XML twig pattern including parent-child edges. In: The 13th International Conference on Web Information Systems and Technologies, WEBIST 2017, pp. 204–211. SCITEPRESS Science and Technology Publications, Lda, Porto (2017)
  3. Bruno, N., Koudas, N., Srivastava, D.: Holistic twig joins: optimal XML pattern matching. In: Proceedings of the 2002 ACM SIGMOD International Conference on Management of Data, pp. 310–321. ACM, Madison (2002)
  4. Chen, S., Li, H.-G., Tatemura, J., Hsiung, W.-P., Agrawal, D., Sel, K., Candan, K.S.: Twig2Stack: bottom-up processing of generalized-tree-pattern queries over XML documents (2006)
  5. Chen, T., Lu, J., Ling, T.W.: On boosting holism in XML twig pattern matching using structural indexing techniques. In: Science, pp. 455–466 (2005)
  6. Choi, B., Mahoui, M., Wood, D.: On the optimality of holistic algorithms for twig queries. In: Mařík, V., Retschitzegger, W., Štěpánková, O. (eds.) DEXA 2003. LNCS, vol. 2736, pp. 28–37. Springer, Heidelberg (2003). https://doi.org/10.1007/978-3-540-45227-0_4
  7. Grimsmo, N., Bjørklund, T.A., Hetland, M.L.: Fast optimal twig joins. VLDB 3(1–2), 894–905 (2010)
  8. Li, J., Wang, J.: Fast matching of twig patterns. In: Bhowmick, S.S., Küng, J., Wagner, R. (eds.) DEXA 2008. LNCS, vol. 5181, pp. 523–536. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-85654-2_45
  9. Lu, J., Chen, T., Ling, T.W.: Efficient processing of XML twig patterns with parent child edges: a look-ahead approach. In: Proceedings of the Thirteenth ACM International Conference on Information and Knowledge Management, no. i, pp. 533–542. ACM, Washington, D.C. (2004)
  10. Jiaheng, L., Meng, X., Ling, T.W.: Indexing and querying XML using extended Dewey labeling scheme. Data Knowl. Eng. 70(1), 35–59 (2011)
  11. Qin, L., Yu, J.X., Ding, B.: TwigList: make twig pattern matching fast. In: Kotagiri, R., Krishna, P.R., Mohania, M., Nantajeewarawat, E. (eds.) DASFAA 2007. LNCS, vol. 4443, pp. 850–862. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-71703-4_70
  12. Wu, H., Lin, C., Ling, T.W., Lu, J.: Processing XML twig pattern query with wildcards. In: Liddle, S.W., Schewe, K.-D., Tjoa, A.M., Zhou, X. (eds.) DEXA 2012. LNCS, vol. 7446, pp. 326–341. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-32600-4_24
  13. Zhang, C., Naughton, J., DeWitt, D., Luo, Q., Lohman, G.: On supporting containment queries in relational database management systems. ACM SIGMOD Rec. 30, 425–436 (2001)
  14. Al-Khalifa, S., Jagadish, H.V., Koudas, N., Patel, J.M., Srivastava, D., Wu, Y.: Structural joins: a primitive for efficient XML query pattern matching. In: Proceedings of 18th International Conference on Data Engineering, pp. 141–152 (2002)
  15. Haw, S.-C., Lee, C.-S.: Data storage practices and query processing in XML databases: a survey. Knowl. Based Syst. 24(8), 1317–1340 (2011)
  16. Gang, G., Chirkova, R.: Efficiently querying large XML data repositories: a survey. IEEE Trans. Knowl. Data Eng. 19(10), 1381–1403 (2007)
  17. Lu, J., Ling, T.W., Bao, Z., Wang, C.: Extended XML tree pattern matching: theories and algorithms. IEEE Trans. Knowl. Data Eng. 23(3), 402–416 (2011a)
  18. Goldman, R., Widom, J.: Dataguides: enabling query formulation and optimization in semistructured databases. In: Proceedings of International Conference on Very Large Data Bases, pp. 436–445 (1997)
  19. Kaushik, R., Bohannon, P., Naughton, J.F., Korth, H.F.: Covering indexes for branching path queries. In: Proceedings of 2002 ACM SIGMOD International Conference on Management Data, SIGMOD 2002, p. 133 (2002)
  20. Bača, R., Krátký, M., Ling, T.W., Lu, J.: Optimal and efficient generalized twig pattern processing: a combination of preorder and postorder filterings. VLDB J. 22(3), 369–393 (2012)
  21. Bača, R., Krátký, M.: XML query processing. In: Proceedings of 16th International Database Engineering Application Symposium, IDEAS 2012, p. 813 (2012)
  22. Mathis, C., Härder, T., Schmidt, K., Bächle, S.: XML indexing and storage: fulfilling the wish list. Comput. Sci. Res. Dev. 30, 118 (2012)
  23. Schmidt, A., Waas, F., Kersten, M., Busse, R., Carey, M.J., Amsterdam, G.B.: XMark: a benchmark for XML data management. In: VLDB 2002 Proceedings of the 28th International Conference on Very Large Data Bases, pp. 974–985 (2002)
  24. Miklau, G.: UW XMLData Repository. http://www.cs.washington.edu/research/xmldatasets/. Accessed 04 Feb 2016

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. College of Computer Engineering and Science, Prince Sattam bin Abdulaziz University, Al-Kharj, Kingdom of Saudi Arabia
  2. Department of Computer Science, The University of Sheffield, Sheffield, UK
