Efficient dynamic programming algorithms for ordering expensive joins and selections
The generally accepted optimization heuristic of pushing selections down does not yield optimal plans in the presence of expensive predicates. Therefore, several researchers have proposed algorithms to compute optimal processing trees for queries with expensive predicates. All of these approaches are incorrect, with one exception. Our contribution is as follows. We present a formally derived and correct dynamic programming algorithm to compute optimal bushy processing trees for queries with expensive predicates. This algorithm is then enhanced to (1) handle several join algorithms, including sort-merge with a correct handling of interesting sort orders, (2) perform predicate splitting, and (3) exploit structural information about the query graph to cut down the search space. Further, we present efficient implementations of the algorithms. More specifically, we introduce unique solutions for efficiently computing the cost of the intermediate plans and for saving memory space by utilizing bitvector contraction. Our implementations impose no restrictions on the type of query graphs, the shape of processing trees, or the class of cost functions. We establish the correctness of our algorithms and derive tight asymptotic bounds on the worst-case time and space complexities. We also report on a series of benchmarks showing that queries of sizes that are likely to occur in practice can be optimized over the unconstrained search space in less than a second.
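The claim that pushing selections down can be suboptimal is easy to check on a toy instance. The following is a minimal sketch, not the authors' algorithm: it assumes a simplified cost model (nested-loop joins costing the product of the input cardinalities, selections costing input cardinality times a per-tuple cost, and each expensive predicate referring to a single relation). The names `optimal_cost`, `card`, `join_sel`, and `preds` are hypothetical. The point it illustrates is that the dynamic programming table is indexed by the set of joined relations together with the set of expensive predicates already applied, both encoded as bitvectors.

```python
from functools import lru_cache


def optimal_cost(card, join_sel, preds):
    """Cheapest bushy plan under the toy cost model described above.

    card[i]          -- cardinality of relation i
    join_sel[(i, j)] -- selectivity of the join predicate between i and j
    preds[k]         -- (relation, selectivity, cost_per_tuple) of an
                        expensive selection predicate
    """
    n = len(card)

    def pred_mask(rels):
        # Bitvector of the predicates whose relation is contained in rels.
        m = 0
        for k, (r, _, _) in enumerate(preds):
            if (rels >> r) & 1:
                m |= 1 << k
        return m

    def size(rels, applied):
        # Result cardinality depends only on the state, not on the plan.
        s = 1.0
        for i in range(n):
            if (rels >> i) & 1:
                s *= card[i]
        for (i, j), f in join_sel.items():
            if (rels >> i) & 1 and (rels >> j) & 1:
                s *= f
        for k, (_, f, _) in enumerate(preds):
            if (applied >> k) & 1:
                s *= f
        return s

    @lru_cache(maxsize=None)
    def best(rels, applied):
        # Base case: scanning a single relation is free in this model.
        if (rels & (rels - 1)) == 0 and applied == 0:
            return 0.0
        candidates = []
        # Case 1: the top operator is one of the applied selections.
        for k, (_, _, c) in enumerate(preds):
            if (applied >> k) & 1:
                below = applied & ~(1 << k)
                candidates.append(best(rels, below) + size(rels, below) * c)
        # Case 2: the top operator is a join of two disjoint subsets.
        left = (rels - 1) & rels
        while left:
            right = rels ^ left
            if left < right:  # the join cost is symmetric, skip mirror splits
                lp = applied & pred_mask(left)
                rp = applied & pred_mask(right)
                candidates.append(best(left, lp) + best(right, rp)
                                  + size(left, lp) * size(right, rp))
            left = (left - 1) & rels
        return min(candidates)

    all_rels = (1 << n) - 1
    return best(all_rels, pred_mask(all_rels))


# Two relations and one expensive predicate on relation 0.
# Push-down: 100 * 10 (selection) + 50 * 10 (join)   = 1500
# Pull-up:   100 * 10 (join)      + 10 * 10 (selection) = 1100
print(optimal_cost([100, 10], {(0, 1): 0.01}, [(0, 0.5, 10.0)]))
```

In this instance the join is selective enough that evaluating the expensive predicate after it is cheaper; with a per-tuple cost of 0.1 instead of 10, the same function picks the push-down plan. The sketch enumerates the full state space and makes no attempt at the memory savings or the cost-computation shortcuts mentioned in the abstract.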