Abstract
In data integration systems, queries posed to a mediator need to be translated into a sequence of queries to the underlying data sources. In a heterogeneous environment, with sources of diverse and limited query capabilities, not all translations are feasible. In this paper, we study the problem of finding feasible and efficient query plans for mediator systems. We consider conjunctive queries on mediators and model the source capabilities through attribute-binding adornments. We use a simple cost model that focuses on the major costs in mediation systems: those involved in sending queries to sources and getting answers back. Under this metric, we develop two algorithms for source query sequencing: one based on a simple greedy strategy and another based on a partitioning scheme. The first algorithm produces optimal plans in some scenarios, and we show a linear bound on its worst-case performance when it misses optimal plans. The second algorithm generates optimal plans in more scenarios, but has no bound on the margin by which it misses the optimal plans. We also report the results of experiments that study the performance of the two algorithms.
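To make the notions of attribute-binding adornments and greedy sequencing concrete, here is a minimal sketch in Python. It is not the algorithm from the paper: the data layout, the names `is_answerable` and `greedy_sequence`, and the fixed per-subgoal cost table are all assumptions made for illustration. Each subgoal carries one or more adornments, strings over `b` (must be bound) and `f` (may be free), and a subgoal can be sent to its source only when some adornment has all of its `b` positions bound by constants or by earlier subgoals.

```python
from typing import Dict, FrozenSet, List, Optional, Set, Tuple

# Hypothetical representation: (source name, variables, allowed adornments).
# Each adornment has one character per variable: 'b' = bound, 'f' = free.
Subgoal = Tuple[str, Tuple[str, ...], Tuple[str, ...]]


def is_answerable(subgoal: Subgoal, bound: Set[str]) -> bool:
    """True if some adornment has every 'b' position already bound."""
    _, variables, adornments = subgoal
    return any(
        all(var in bound for var, flag in zip(variables, adornment) if flag == "b")
        for adornment in adornments
    )


def greedy_sequence(
    subgoals: List[Subgoal],
    cost: Dict[str, float],
    initially_bound: FrozenSet[str] = frozenset(),
) -> Optional[List[str]]:
    """Order subgoals greedily; return None if no feasible order is found.

    Assumption: `cost` is a fixed estimate per source query. A realistic
    cost model would depend on the bindings available at each step.
    """
    bound = set(initially_bound)
    remaining = list(subgoals)
    plan: List[str] = []
    while remaining:
        candidates = [s for s in remaining if is_answerable(s, bound)]
        if not candidates:
            return None  # every remaining subgoal needs an unbound attribute
        best = min(candidates, key=lambda s: cost[s[0]])
        plan.append(best[0])
        bound.update(best[1])  # answering a subgoal binds all its variables
        remaining.remove(best)
    return plan


subgoals = [
    ("R", ("x", "y"), ("ff",)),       # R can be queried with nothing bound
    ("S", ("y", "z"), ("bf",)),       # S requires y to be bound
    ("T", ("z", "w"), ("bf", "fb")),  # T requires z or w to be bound
]
print(greedy_sequence(subgoals, {"R": 5, "S": 1, "T": 2}))
```

In this example only `R` is answerable at the start, so it is sequenced first despite its higher cost; its answer binds `y`, which makes `S` answerable, which in turn binds `z` for `T`. Because binding a variable never makes a subgoal unanswerable, this loop returns `None` only when no feasible sequence exists at all.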
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yerneni, R., Li, C., Ullman, J., Garcia-Molina, H. (1999). Optimizing Large Join Queries in Mediation Systems. In: Beeri, C., Buneman, P. (eds) Database Theory — ICDT’99. ICDT 1999. Lecture Notes in Computer Science, vol 1540. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-49257-7_22
DOI: https://doi.org/10.1007/3-540-49257-7_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65452-0
Online ISBN: 978-3-540-49257-3
eBook Packages: Springer Book Archive