Abstract
XPath 1 [4] is a practical language for selecting nodes from XML document trees and plays an essential role in other XML-related technologies such as XSLT and XQuery. Implementations of XPath need to scale well both with respect to the size of the XML data and the growing size and intricacy of the queries (i.e., combined complexity). Unfortunately, current XPath engines use query evaluation techniques that require time exponential in the size of queries in the worst case [1].
The main aim of this tutorial is to show that XPath can be processed much more efficiently, and how. We present an algorithm for this problem with polynomial-time combined query evaluation complexity. Then we discuss various improvements over the basic evaluation algorithm, such as the context-restriction technique of [3], which lead to better worst-case bounds. We provide an overview of XPath fragments that can be processed even more efficiently, most prominently one that can be evaluated in linear time (cf. [1,3]). Next we discuss the parallel complexity of XPath. While full XPath is P-complete w.r.t. combined complexity, various (minor) restrictions are known that reduce the complexity to highly parallelizable complexity classes [2]. Finally, we give an overview of further recent work on efficient XPath processing, in particular using logic- and automata-based techniques.
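The gap between exponential and polynomial evaluation can be illustrated with a minimal sketch. It is a toy under stated assumptions, not the algorithm of [1]: it supports only the child and parent axes with a name test, and the names Node, step, eval_naive and eval_sets are hypothetical. The naive evaluator recurses once per context node without merging duplicates, while the set-based evaluator applies each location step to the whole set of current nodes.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass(eq=False)  # eq=False keeps identity-based hashing, so nodes fit in sets
class Node:
    name: str
    parent: Optional["Node"] = field(default=None, repr=False)
    children: List["Node"] = field(default_factory=list, repr=False)

    def add(self, name: str) -> "Node":
        """Append a new child element and return it."""
        child = Node(name, parent=self)
        self.children.append(child)
        return child


Step = Tuple[str, str]  # (axis, name test); only "child" and "parent" are supported


def step(node: Node, axis: str, name: str) -> List[Node]:
    """Apply a single location step to a single context node."""
    if axis == "child":
        candidates = node.children
    elif axis == "parent":
        candidates = [node.parent] if node.parent is not None else []
    else:
        raise ValueError(f"unsupported axis: {axis}")
    return [n for n in candidates if n.name == name]


def eval_naive(node: Node, steps: List[Step], calls: List[int]) -> List[Node]:
    """Context-at-a-time evaluation: duplicates are never merged.

    calls[0] counts recursive invocations (an illustrative cost measure).
    """
    calls[0] += 1
    if not steps:
        return [node]
    (axis, name), rest = steps[0], steps[1:]
    result: List[Node] = []
    for n in step(node, axis, name):
        result.extend(eval_naive(n, rest, calls))
    return result


def eval_sets(node: Node, steps: List[Step]) -> Tuple[Set[Node], int]:
    """Set-at-a-time evaluation: each step runs once over a duplicate-free set.

    Returns the result set and the number of (node, step) applications.
    """
    current: Set[Node] = {node}
    work = 0
    for axis, name in steps:
        work += len(current)
        current = {m for n in current for m in step(n, axis, name)}
    return current, work


if __name__ == "__main__":
    root = Node("a")
    root.add("b")
    root.add("b")
    # child::b/parent::a repeated k times, evaluated from the root
    for k in (2, 5, 10):
        query = [("child", "b"), ("parent", "a")] * k
        calls = [0]
        eval_naive(root, query, calls)
        _, work = eval_sets(root, query)
        print(k, calls[0], work)
```

On this two-leaf document, each repetition of child::b/parent::a doubles the number of contexts the naive evaluator visits, whereas the set-based evaluator never holds more than two nodes at a time, so its work grows linearly with the length of the query.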
Keywords
- Query Evaluation
- Large Data Base
- Parallel Complexity
- Austrian Science Fund
- XPath Query
This tutorial is based on joint work with Reinhard Pichler and was partially supported by the Austrian Science Fund (FWF) under project No. Z29-N04 and the GAMES Network of Excellence of the European Union. The second author’s work was sponsored by Erwin Schrödinger Grant J2169 of the FWF. Further information, including the slides of this tutorial, can be found at http://www.xmltaskforce.com.
References
1. Gottlob, G., Koch, C., Pichler, R.: Efficient Algorithms for Processing XPath Queries. In: Proceedings of the 28th International Conference on Very Large Data Bases (VLDB 2002), Hong Kong, China (2002)
2. Gottlob, G., Koch, C., Pichler, R.: The Complexity of XPath Query Processing. In: Proceedings of the 22nd ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS 2003), San Diego, California (2003)
3. Gottlob, G., Koch, C., Pichler, R.: XPath Query Evaluation: Improving Time and Space Efficiency. In: Proceedings of the 19th IEEE International Conference on Data Engineering (ICDE 2003), Bangalore, India (March 2003)
4. World Wide Web Consortium: XML Path Language (XPath) Recommendation (November 1999), http://www.w3c.org/TR/xpath/
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gottlob, G., Koch, C. (2004). XPath Query Processing. In: Lausen, G., Suciu, D. (eds) Database Programming Languages. DBPL 2003. Lecture Notes in Computer Science, vol 2921. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24607-7_2
DOI: https://doi.org/10.1007/978-3-540-24607-7_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20896-9
Online ISBN: 978-3-540-24607-7
eBook Packages: Springer Book Archive