Abstract
Efficient evaluation of queries on XML data is a major research issue. Structural join based techniques are well known for XPath evaluation. For long path expressions, however, join-based techniques are inefficient because they increase the number of joins and the disk I/O cost. Path-based techniques try to reduce the number of joins. In this paper, we propose a metadata-guided query evaluation technique that uses path-based storage. We use interval encoding for the nodes. In addition, we use a Strong DataGuide to assign integer path labels to the distinct root-to-node label paths in the data tree. For each distinct path, we maintain an element list consisting of the nodes that can be reached by that path. The Element-Map gives the one-to-many mapping from element names (tag names) to the element lists whose nodes have that tag name. The Path-Map gives the root-to-leaf path for a given path label. Using these structures, we combine top-down path matching with bottom-up path selection to evaluate linear path expressions efficiently. For twig queries, we perform structural joins at the branch points. Through experimental evaluation on standard datasets, we show that our approach outperforms the existing path-index based approaches, which in turn outperform structural join methods.
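The structures named in the abstract can be illustrated with a small in-memory sketch. This is not the authors' implementation: the function names (`build_index`, `parse`, `matches`, `evaluate`), the (start, end) counter scheme used for interval encoding, the sample document, and the restriction to `/` and `//` steps without wildcards or predicates are all assumptions made for illustration. The sketch also keeps everything in memory, and it stores a label path for every path label, not only for leaves.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_index(xml_text):
    """Build a toy Path-Map, Element-Map and per-path element lists."""
    root = ET.fromstring(xml_text)
    path_ids = {}                      # label path (tuple of tags) -> integer path label
    path_map = {}                      # integer path label -> label path (Path-Map)
    element_map = defaultdict(set)     # tag name -> path labels ending in that tag (Element-Map)
    element_lists = defaultdict(list)  # path label -> [(start, end), ...] in document order
    counter = 0

    def visit(node, prefix):
        nonlocal counter
        path = prefix + (node.tag,)
        if path not in path_ids:       # first node seen on this distinct label path
            pid = len(path_ids) + 1
            path_ids[path] = pid
            path_map[pid] = path
            element_map[node.tag].add(pid)
        pid = path_ids[path]
        counter += 1
        start = counter                # interval opens before the subtree
        for child in node:
            visit(child, path)
        counter += 1                   # interval closes after the subtree
        # Nodes on the same label path never nest, so appending here
        # still yields document order within each element list.
        element_lists[pid].append((start, counter))

    visit(root, ())
    return path_map, element_map, element_lists


def parse(query):
    """Split '//a/b//c' into [('//', 'a'), ('/', 'b'), ('//', 'c')]."""
    return re.findall(r"(//|/)([^/]+)", query)


def matches(steps, label_path):
    """Top-down check of a linear path expression against one label path."""
    pattern = "".join(
        ("(?:/[^/]+)*" if axis == "//" else "") + "/" + re.escape(tag)
        for axis, tag in steps
    )
    return re.fullmatch(pattern, "/" + "/".join(label_path)) is not None


def evaluate(query, path_map, element_map, element_lists):
    """Evaluate a linear path expression without any structural join."""
    steps = parse(query)
    last_tag = steps[-1][1]
    result = []
    # Bottom-up selection: only path labels ending in the last tag are candidates.
    for pid in sorted(element_map.get(last_tag, ())):
        if matches(steps, path_map[pid]):
            result.extend(element_lists[pid])
    return sorted(result)


# Hypothetical sample document, used only to exercise the sketch.
DOC = ("<lib><book><title/><author><name/></author></book>"
       "<book><title/></book><journal><title/></journal></lib>")
path_map, element_map, element_lists = build_index(DOC)
print(evaluate("//book/title", path_map, element_map, element_lists))
```

In this sketch a linear query touches only the element lists whose label path matches, so no join is performed. A twig query would additionally need an interval containment test (ancestor start < descendant start and descendant end < ancestor end) at each branch point, which is omitted here.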
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Archana, M., Narayana, M.L., Kumar, P.S. (2007). PSMQ: Path Based Storage and Metadata Guided Twig Query Evaluation. In: Cooper, R., Kennedy, J. (eds) Data Management. Data, Data Everywhere. BNCOD 2007. Lecture Notes in Computer Science, vol 4587. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73390-4_11
DOI: https://doi.org/10.1007/978-3-540-73390-4_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73389-8
Online ISBN: 978-3-540-73390-4
eBook Packages: Computer Science, Computer Science (R0)