PSMQ: Path Based Storage and Metadata Guided Twig Query Evaluation

  • M. Archana
  • M. Lakshmi Narayana
  • P. Sreenivasa Kumar
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4587)

Abstract

Efficient evaluation of queries on XML data is a major research issue. Structural join based techniques are well known for XPath evaluation. For long path expressions, however, join techniques are inefficient because they increase the number of joins and the disk I/O cost. Path based techniques try to reduce the number of joins. In this paper, we propose a metadata guided query evaluation technique that uses path based storage. We use interval encoding for the nodes. In addition, we use a Strong DataGuide to assign integer path labels to the distinct root-to-node label paths in the data tree. An element list is maintained for each distinct path, consisting of the nodes that can be reached by that path. The Element-Map gives the one-to-many mapping from element names (or tag names) to the element lists containing nodes with that tag name. The Path-Map gives the root-to-leaf path for a given path label. Using these structures, we combine top-down path matching and bottom-up path selection to efficiently evaluate linear path expressions. For twig queries, we perform structural joins at branch points. Through experimental evaluation on standard datasets, we show that our approach outperforms the existing path-index based approaches, which in turn outperform structural join methods.
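The structures named in the abstract can be illustrated with a small in-memory sketch. This is not the authors' implementation: the abstract does not give the algorithms, so every function name here (`build_index`, `evaluate`, `structural_join`), the regular-expression test used for path matching, and the nested-loop join are assumptions made for illustration. The sketch handles only linear paths built from `/` and `//` steps with plain tag names, and its Element-Map stores path labels, each of which identifies one element list.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

def build_index(xml_text):
    """One depth-first pass: interval codes, path labels, and the two maps."""
    path_ids = {}                      # label path (tuple of tags) -> integer path label
    path_map = {}                      # Path-Map: path label -> label path
    element_map = defaultdict(set)     # Element-Map: tag name -> path labels ending in it
    element_lists = defaultdict(list)  # path label -> [start, end] codes in document order
    counter = 0

    def visit(node, prefix):
        nonlocal counter
        path = prefix + (node.tag,)
        pid = path_ids.setdefault(path, len(path_ids) + 1)
        path_map[pid] = path
        element_map[node.tag].add(pid)
        counter += 1
        entry = [counter, 0]           # end is filled in after the subtree is visited
        element_lists[pid].append(entry)
        for child in node:
            visit(child, path)
        counter += 1
        entry[1] = counter

    visit(ET.fromstring(xml_text), ())
    lists = {pid: [tuple(e) for e in nodes] for pid, nodes in element_lists.items()}
    return path_map, element_map, lists

def evaluate(query, path_map, element_map, element_lists):
    """Evaluate a linear path such as //book/title without any joins."""
    steps = re.findall(r"(//|/)([^/]+)", query)
    # Assumed matching scheme: turn the query into a regex over "/a/b/c" label paths.
    pattern = "".join(
        ("(?:/[^/]+)*/" if axis == "//" else "/") + re.escape(tag)
        for axis, tag in steps
    )
    result = []
    # Bottom-up selection: only paths ending in the last tag are candidates.
    for pid in element_map.get(steps[-1][1], ()):
        # Top-down matching: check the candidate's full label path against the query.
        if re.fullmatch(pattern, "/" + "/".join(path_map[pid])):
            result.extend(element_lists[pid])
    return sorted(result)

def structural_join(ancestors, descendants):
    """Keep ancestors whose interval contains at least one descendant interval."""
    return [a for a in ancestors
            if any(a[0] < d[0] and d[1] < a[1] for d in descendants)]

DOC = """<lib>
  <book><title/><author/></book>
  <book><title/></book>
  <journal><title/></journal>
</lib>"""

path_map, element_map, element_lists = build_index(DOC)
titles = evaluate("//book/title", path_map, element_map, element_lists)
books = evaluate("//book", path_map, element_map, element_lists)
authors = evaluate("//book//author", path_map, element_map, element_lists)
print(titles)                           # [(3, 4), (9, 10)]
print(structural_join(books, authors))  # [(2, 7)]
```

In this sketch the linear query `//book/title` is answered by reading one element list, and a join is needed only at the branch point of the twig "books that have an author descendant". A parent-child branch would additionally need a level or path-length check, which the sketch omits.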

Keywords

DataGuide, XPath, structural summary, structural join



Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • M. Archana (1)
  • M. Lakshmi Narayana (1)
  • P. Sreenivasa Kumar (1)

  1. IIT Madras, Chennai 600036, India
