Join Order Selection (Good Enough Is Easy)

  • Florian Waas
  • Arjan Pellenkoft
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1832)

Abstract

Uniform sampling of join orders is known to be a competitive alternative to transformation-based optimization techniques. However, uniformity of the sampling process is difficult to establish, and techniques are known only for a restricted class of join queries.

In this paper, we investigate non-uniform sampling and devise a simple yet powerful algorithm that is generally applicable. The key element of the algorithm is a mapping of randomly generated sequences of join predicates to query plans. We take advantage of the bottom-up construction of query plans by computing costs as the plans are built and discarding partial plans as soon as they exceed the best cost found so far, which implements a highly effective cost-bound pruning component.
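The idea in the paragraph above can be illustrated with a minimal sketch. It is not the authors' implementation: the cost model (sum of intermediate result sizes), the dictionary-based inputs, and the names `sample_plan` and `optimize` are assumptions made for illustration only. A random permutation of the join predicates is processed in order, each predicate merges the two partial plans it connects, and a sample is abandoned as soon as its accumulated cost reaches the best cost found so far.

```python
import random

def sample_plan(cards, preds, bound, rng):
    """Map one random predicate sequence to a plan.

    cards: {relation: cardinality}; preds: {(rel_a, rel_b): selectivity}.
    Returns (cost, plan), or None if the partial plan was pruned.
    Assumes a connected join graph (hypothetical input format).
    """
    order = list(preds)
    rng.shuffle(order)  # the randomly generated predicate sequence
    # Every relation starts as its own partial plan: (plan, cardinality, relations).
    plans = {r: (r, float(c), frozenset([r])) for r, c in cards.items()}
    cost = 0.0
    for a, b in order:
        (pa, ca, ra), (pb, cb, rb) = plans[a], plans[b]
        if ra == rb:
            continue  # both sides already joined; predicate was applied earlier
        # Apply every predicate that spans the two partial plans.
        sel = 1.0
        for (x, y), s in preds.items():
            if (x in ra and y in rb) or (x in rb and y in ra):
                sel *= s
        card = ca * cb * sel
        cost += card  # assumed cost model: sum of intermediate result sizes
        if cost >= bound:
            return None  # cost-bound pruning: cannot beat the best plan so far
        merged = ((pa, pb), card, ra | rb)
        for r in merged[2]:
            plans[r] = merged
    return cost, plans[next(iter(cards))][0]

def optimize(cards, preds, samples=100, seed=0):
    """Draw `samples` random sequences and keep the cheapest complete plan."""
    rng = random.Random(seed)
    best_cost, best_plan = float("inf"), None
    for _ in range(samples):
        result = sample_plan(cards, preds, best_cost, rng)
        if result is not None:
            best_cost, best_plan = result
    return best_cost, best_plan

if __name__ == "__main__":
    cards = {"A": 10, "B": 100, "C": 1000}
    preds = {("A", "B"): 0.1, ("B", "C"): 0.01}
    print(optimize(cards, preds, samples=50))
```

Because the bound tightens with every improvement, later samples tend to be cut off after only a few joins, which is where most of the savings in such a scheme would come from.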

Sampling does not produce the optimal plan but a near-optimal solution, which is fully sufficient because the cost function grows increasingly inaccurate with increasing query size. In return, our algorithm establishes a well-balanced trade-off between result quality and the time invested in the optimization process.

Copyright information

© Springer-Verlag Berlin Heidelberg 2000

Authors and Affiliations

  • Florian Waas (1)
  • Arjan Pellenkoft (1)
  1. CWI, Amsterdam, The Netherlands
  2. Università di Bologna, Bologna, Italy