Query Answering and Containment for Regular Path Queries under Distortions
We give a general framework for approximate query processing in semistructured databases. We focus on regular path queries, which are an integral part of most query languages for semistructured databases. To enable approximations, we allow the regular path queries to be distorted. The distortions are expressed using weighted regular expressions, which correspond to weighted regular transducers. After defining the notion of weighted approximate answers, we show how to compute them in order of their proximity to the query. In this approximate setting, query containment has to be redefined to take into account the quantitative proximity information in the query answers. To this end, we define approximate containment and its variants, k-containment and reliable containment. We then give an optimal algorithm for deciding k-containment. For reliable approximate containment, we show that it is polynomial-time equivalent to the notorious limitedness problem for distance automata.
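The idea of computing approximate answers in order of their proximity to the query can be illustrated with a small sketch. It is not the algorithm of the paper: it assumes, for illustration only, that a distortion is limited to replacing one query label by another at a fixed non-negative weight, whereas a weighted transducer is more general. The function name `ranked_answers`, the dictionary encodings of the database graph and the query automaton, and the `subst_cost` table are all hypothetical. The sketch runs Dijkstra's shortest-path search over pairs (database node, automaton state) and reports each node the first time it is reached in a final state.

```python
import heapq


def ranked_answers(graph, nfa, start_state, final_states, subst_cost, source):
    """Yield (node, cost) pairs in nondecreasing cost order.

    graph:      node -> list of (edge_label, target_node)
    nfa:        state -> list of (query_label, target_state)
    subst_cost: (query_label, db_label) -> non-negative weight (assumed model)
    """
    heap = [(0, source, start_state)]
    settled = set()   # (node, state) pairs whose best cost is final
    reported = set()  # nodes already returned as answers
    while heap:
        cost, node, state = heapq.heappop(heap)
        if (node, state) in settled:
            continue
        settled.add((node, state))
        if state in final_states and node not in reported:
            reported.add(node)
            yield node, cost
        for db_label, next_node in graph.get(node, []):
            for q_label, next_state in nfa.get(state, []):
                if q_label == db_label:
                    step = 0  # exact match, no distortion
                else:
                    step = subst_cost.get((q_label, db_label))
                    if step is None:
                        continue  # this distortion is not allowed
                if (next_node, next_state) not in settled:
                    heapq.heappush(heap, (cost + step, next_node, next_state))


# Hypothetical data: the query is the path "train train" starting at node "a".
graph = {
    "a": [("train", "b"), ("bus", "c")],
    "b": [("train", "d")],
    "c": [("train", "f"), ("bus", "e")],
}
nfa = {0: [("train", 1)], 1: [("train", 2)]}
subst_cost = {("train", "bus"): 1}

answers = list(ranked_answers(graph, nfa, 0, {2}, subst_cost, "a"))
print(answers)  # [('d', 0), ('f', 1), ('e', 2)]
```

Because all weights are non-negative, the search settles pairs in order of increasing cost, so exact answers (cost 0) are produced before any distorted ones, and a caller can stop consuming the generator once the cost exceeds a chosen threshold.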
Keywords: Regular Expression, Graph Database, Query Answer, Query Answering, Semistructured Data