PSMQ: Path Based Storage and Metadata Guided Twig Query Evaluation

  • Conference paper
Data Management. Data, Data Everywhere (BNCOD 2007)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4587)

Abstract

Efficient evaluation of queries on XML data is a major research issue. Structural-join-based techniques are well known for XPath evaluation. For long path expressions, however, join techniques are inefficient because they increase the number of joins and the disk I/O cost. Path-based techniques try to reduce the number of joins. In this paper, we propose a metadata-guided query evaluation technique that uses path-based storage. We use interval encoding for the nodes. In addition, we use a Strong DataGuide to assign integer path labels to the distinct root-to-node label paths in the data tree. For each distinct path, we maintain an element list consisting of the nodes that can be reached by that path. The Element-Map gives the one-to-many mapping from element names (tag names) to the element lists whose nodes have that tag name. The Path-Map gives the root-to-leaf path for a given path label. Using these structures, we combine top-down path matching and bottom-up path selection to evaluate linear path expressions efficiently. For twig queries, we perform structural joins at branch points. Through experimental evaluation on standard datasets, we show that our approach outperforms the existing path-index-based approaches, which in turn outperform structural join methods.
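The structures named in the abstract can be illustrated with a small Python sketch. This is not the authors' implementation: the function names (`build_index`, `matches`, `evaluate_linear`, `having_descendant`, `having_ancestor`), the representation of a query as a list of `(axis, tag)` pairs, the in-memory dictionaries, and the sample document are all assumptions made for illustration. The join at the branch point is a naive nested-loop containment test, swapped in for a real structural join algorithm to keep the example short.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def build_index(xml_text):
    """Interval-encode nodes and group them by root-to-node label path."""
    path_ids = {}                      # label path (tuple of tags) -> integer path label
    path_map = {}                      # Path-Map: path label -> label path
    element_map = defaultdict(set)     # Element-Map: tag name -> path labels ending in that tag
    element_lists = defaultdict(list)  # path label -> [(start, end)] intervals
    counter = 0

    def visit(node, prefix):
        nonlocal counter
        path = prefix + (node.tag,)
        if path not in path_ids:       # first time this label path is seen
            pid = len(path_ids) + 1
            path_ids[path] = pid
            path_map[pid] = path
            element_map[node.tag].add(pid)
        pid = path_ids[path]
        counter += 1
        start = counter
        for child in node:
            visit(child, path)
        counter += 1
        # Nodes sharing a label path are never nested, so appending on exit
        # still keeps each element list in document order.
        element_lists[pid].append((start, counter))

    visit(ET.fromstring(xml_text), ())
    return path_map, element_map, element_lists

def matches(steps, path):
    """Top-down check: does the label path satisfy every (axis, tag) step?"""
    def rec(i, j):                     # i: next step, j: next unconsumed tag in path
        if i == len(steps):
            return j == len(path)      # the last step must land on the last tag
        axis, tag = steps[i]
        positions = [j] if axis == "/" else range(j, len(path))
        return any(path[k] == tag and rec(i + 1, k + 1)
                   for k in positions if k < len(path))
    return rec(0, 0)

def evaluate_linear(steps, index):
    """Select candidate paths by leaf tag, then filter them by path matching."""
    path_map, element_map, element_lists = index
    candidates = element_map.get(steps[-1][1], ())            # bottom-up selection
    hits = [pid for pid in candidates if matches(steps, path_map[pid])]
    return sorted(node for pid in hits for node in element_lists[pid])

def contains(a, d):
    """Interval containment: a is an ancestor of d."""
    return a[0] < d[0] and d[1] < a[1]

def having_descendant(ancestors, descendants):
    return [a for a in ancestors if any(contains(a, d) for d in descendants)]

def having_ancestor(descendants, ancestors):
    return [d for d in descendants if any(contains(a, d) for a in ancestors)]

# Hypothetical sample document.
DOC = ("<lib><book><title/><author/></book><book><author/></book>"
       "<journal><title/></journal></lib>")
index = build_index(DOC)

# Linear path //title: no joins needed, only a union of element lists.
all_titles = evaluate_linear([("//", "title")], index)

# Twig //book[title]/author: join the branches at the branch point "book".
books = evaluate_linear([("//", "book")], index)
titles = evaluate_linear([("//", "book"), ("/", "title")], index)
authors = evaluate_linear([("//", "book"), ("/", "author")], index)
twig_result = having_ancestor(authors, having_descendant(books, titles))

print(all_titles)   # [(3, 4), (13, 14)]
print(twig_result)  # [(5, 6)]
```

In this sketch a linear path expression is answered purely from the metadata plus a union of element lists, which is the source of the join savings the abstract describes; only the twig query needs containment tests between intervals.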

Editor information

Richard Cooper, Jessie Kennedy

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Archana, M., Narayana, M.L., Kumar, P.S. (2007). PSMQ: Path Based Storage and Metadata Guided Twig Query Evaluation. In: Cooper, R., Kennedy, J. (eds) Data Management. Data, Data Everywhere. BNCOD 2007. Lecture Notes in Computer Science, vol 4587. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73390-4_11

  • DOI: https://doi.org/10.1007/978-3-540-73390-4_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-73389-8

  • Online ISBN: 978-3-540-73390-4

  • eBook Packages: Computer Science, Computer Science (R0)
