Advertisement

Automatic Index Selection in RDBMS by Exploring Query Execution Plan Space

  • Piotr Kołaczkowski
  • Henryk Rybiński
Part of the Studies in Computational Intelligence book series (SCI, volume 223)

Abstract

A novel approach to solving Index Selection Problem (ISP) is presented. In contrast to other known ISP approaches, our method searches the space of possible query execution plans, instead of searching the space of index configurations. An evolutionary algorithm is used for searching. The solution is obtained indirectly as the set of indexes used by the best query execution plans. The method has important features over other known algorithms: (1) it converges to the optimal solution, unlike greedy heuristics, which for performance reasons tend to reduce the space of candidate solutions, possibly discarding optimal solutions; (2) though the search space is huge and grows exponentially with the size of the input workload, searching the space of the query plans allows to direct more computational power to the most costly plans, thus yielding very fast convergence to “good enough” solutions; and (3) the costly reoptimization of the workload is not needed for calculating the objective function, so several thousands of candidates can be checked in a second. The algorithm was tested for large synthetic and real-world SQL workloads to evaluate the performace and scalability.

Keywords

Knapsack Problem Index Selection Query Optimizer Query Plan Evolution Strategy 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. 1.
    Back, T., Hoffmeister, F., Schwefel, H.P.: A survey of evolution strategies. In: Proceedings of the Fourth International Conference on Genetic Algorithms, pp. 2–9. Morgan Kaufmann, San Francisco (1991)Google Scholar
  2. 2.
    Barcucci, E., Pinzani, R., Sprugnoli, R.: Optimal selection of secondary indexes. IEEE Trans. Softw. Eng. 16(1), 32–38 (1990), http://dx.doi.org/10.1109/32.44361 CrossRefMathSciNetGoogle Scholar
  3. 3.
    Bruno, N., Chaudhuri, S.: Automatic physical database tuning: a relaxation-based approach. In: SIGMOD 2005: Proceedings of the 2005 ACM SIGMOD international conference on Management of data, pp. 227–238. ACM Press, New York (2005), http://doi.acm.org/10.1145/1066157.1066184 CrossRefGoogle Scholar
  4. 4.
    Caprara, A., Fischetti, M., Maio, D.: Exact and approximate algorithms for the index selection problem in physical database design. IEEE Trans. on Knowl. and Data Eng. 7(6), 955–967 (1995), http://dx.doi.org/10.1109/69.476501 CrossRefGoogle Scholar
  5. 5.
    Caprara, A., González, J.J.S.: Separating lifted odd-hole inequalities to solve the index selection problem. Discrete Appl. Math. 92(2-3), 111–134 (1999), http://dx.doi.org/10.1016/S0166-218X9900050-5 Google Scholar
  6. 6.
    Caprara, A., Salazar, J.: A branch-and-cut algorithm for a generalization of the uncapacitated facility location problem. TOP 4(1), 135–163 (1996), citeseer.ist.psu.edu/caprara95branchcut.html zbMATHCrossRefMathSciNetGoogle Scholar
  7. 7.
    Chan, A.Y.: Index selection in a self-adaptive relational data base management system. Tech. rep., Cambridge, MA, USA (1976)Google Scholar
  8. 8.
    Chaudhuri, S., Narasayya, V.R.: An efficient cost-driven index selection tool for Microsoft SQL Server. In: VLDB 1997: Proceedings of the 23rd International Conference on Very Large Data Bases, pp. 146–155. Morgan Kaufmann Publishers Inc., San Francisco (1997)Google Scholar
  9. 9.
    Choenni, S., Blanken, H.M., Chang, T.: Index selection in relational databases. In: International Conference on Computing and Information, pp. 491–496 (1993), citeseer.ist.psu.edu/choenni93index.html
  10. 10.
    Finkelstein, S., Schkolnick, M., Tiberio, P.: Physical database design for relational databases. ACM Trans. Database Syst. 13(1), 91–128 (1988), http://doi.acm.org/10.1145/42201.42205 CrossRefGoogle Scholar
  11. 11.
    Fotouhi, F., Galarce, C.E.: Genetic algorithms and the search for optimal database index selection. In: Proceedings of the The First Great Lakes Computer Science Conference on Computing in the 90’s, pp. 249–255. Springer, London (1991)Google Scholar
  12. 12.
    Ganek, A.G., Corbi, T.A.: The dawning of the autonomic computing era. IBM Syst. J. 42(1), 5–18 (2003)CrossRefGoogle Scholar
  13. 13.
    Glover, F.: Tabu search – part i. ORSA Journal on Computing 1(3), 190–206 (1989)zbMATHGoogle Scholar
  14. 14.
    Ip, M.Y.L., Saxton, L.V., Raghavan, V.V.: On the selection of an optimal set of indexes. IEEE Trans. Softw. Eng. 9(2), 135–143 (1983), http://dx.doi.org/10.1109/TSE.1983.236458 CrossRefGoogle Scholar
  15. 15.
    Kirkpatrick, S., Gelatt, C.D., Vecchi, M.P.: Optimization by simulated annealing. Science 220, 671–680 (1983)CrossRefMathSciNetGoogle Scholar
  16. 16.
    Kołaczkowski, P.: Compressing very large database workloads for continuous online index selection. In: Bhowmick, S.S., Küng, J., Wagner, R. (eds.) DEXA 2008. LNCS, vol. 5181, pp. 791–799. Springer, Heidelberg (2008)CrossRefGoogle Scholar
  17. 17.
    Kormilitsin, M., Chirkova, R., Fathi, Y., Stallman, M.: Plan-based view and index selection for query-performance improvement. Tech. Rep. 18, NC State University, Dept. of Computer Science (2008)Google Scholar
  18. 18.
    Kratica, J., Ljubić, I., Tošić, D.: A genetic algorithm for the index selection problem (2003), citeseer.ist.psu.edu/568873.html
  19. 19.
    Meyn, S.P., Tweedie, R.: Markov Chains and Stochastic Stability. Springer, Heidelberg (1993)zbMATHGoogle Scholar
  20. 20.
    Papadomanolakis, S., Ailamaki, A.: An integer linear programming approach to database design. In: Workshop on Self-Managing Database Systems (2007)Google Scholar
  21. 21.
    Papadomanolakis, S., Dash, D., Ailamaki, A.: Efficient use of the query optimizer for automated physical design. In: VLDB 2007: Proceedings of the 33rd international conference on Very large data bases, pp. 1093–1104. VLDB Endowment (2007)Google Scholar
  22. 22.
    Parr, T.: ANTLRv3: Another tool for language recognition (2003–2008), http://www.antlr.org/
  23. 23.
    Russell, S.J., Norvig, P.: Artificial Intelligence: A Modern Approach. Pearson Education, London (2003), http://portal.acm.org/citation.cfm?id=773294 Google Scholar
  24. 24.
    Sattler, K.U., Schallehn, E., Geist, I.: Autonomous query-driven index tuning. In: IDEAS 2004: Proceedings of the International Database Engineering and Applications Symposium (IDEAS 2004), pp. 439–448. IEEE Computer Society Press, Los Alamitos (2004), http://dx.doi.org/10.1109/IDEAS.2004.15 CrossRefGoogle Scholar
  25. 25.
    Schnaitter, K., Abiteboul, S., Milo, T., Polyzotis, N.: Colt: continuous on-line tuning. In: SIGMOD 2006: Proceedings of the 2006 ACM SIGMOD international conference on Management of data, pp. 793–795. ACM Press, New York (2006), http://doi.acm.org/10.1145/1142473.1142592 CrossRefGoogle Scholar
  26. 26.
    Skelley, A.: DB2 advisor: An optimizer smart enough to recommend its own indexes. In: ICDE 2000: Proceedings of the 16th International Conference on Data Engineering, p. 101. IEEE Computer Society, Washington (2000)Google Scholar
  27. 27.
    Talebi, Z.A., Chirkova, R., Fathi, Y., Stallmann, M.: Exact and inexact methods for selecting views and indexes for olap performance improvement. In: EDBT 2008: Proceedings of the 11th international conference on Extending database technology, pp. 311–322. ACM, New York (2008), http://doi.acm.org/10.1145/1353343.1353383 Google Scholar
  28. 28.
    Transaction Performance Council: The TPC-H decision support benchmark (2001–2008), http://www.tpc.org/tpch/
  29. 29.
    Whang, K.-Y.: Index selection in relational databases. In: FODO, pp. 487–500 (1985)Google Scholar

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Piotr Kołaczkowski
    • 1
  • Henryk Rybiński
    • 1
  1. 1.Warsaw University of Technology 

Personalised recommendations