Abstract
Rewriting queries using views is a powerful technique that has applications in data integration, data warehousing and query optimization. Query rewriting in relational databases is by now rather well investigated. However, in the framework of semistructured data the problem of rewriting has received much less attention. In this paper we identify some difficulties with currently known methods for using rewritings in semistructured databases. We study the problem in a realistic setting, proposed in information integration systems such as the Information Manifold, in which the data sources are modelled as sound views over a global schema. We give a new rewriting, which we call the possibility rewriting, that can be used in pruning the search space when answering queries using views. The possibility rewriting can be computed in time polynomial in the size of the original query and the view definitions. Finally, we show by means of a realistic example that our method can reduce the search space by an order of magnitude.
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Grahne, G., Thomo, A. (2001). An Optimization Technique for Answering Regular Path Queries. In: Goos, G., Hartmanis, J., van Leeuwen, J., Suciu, D., Vossen, G. (eds) The World Wide Web and Databases. WebDB 2000. Lecture Notes in Computer Science, vol 1997. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45271-0_14
DOI: https://doi.org/10.1007/3-540-45271-0_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41826-9
Online ISBN: 978-3-540-45271-3