Skip to main content

Regularized Cost-Model Oblivious Database Tuning with Reinforcement Learning

  • Chapter
  • First Online:
Transactions on Large-Scale Data- and Knowledge-Centered Systems XXVIII

Abstract

In this paper, we propose a learning approach to adaptive performance tuning of database applications. The objective is to validate the opportunity to devise a tuning strategy that does not need prior knowledge of a cost model. Instead, the cost model is learned through reinforcement learning. We instantiate our approach to the use case of index tuning. We model the execution of queries and updates as a Markov decision process whose states are database configurations, actions are configuration changes, and rewards are functions of the cost of configuration change and query and update evaluation. During the reinforcement learning process, we face two important challenges: the unavailability of a cost model and the size of the state space. To address the former, we iteratively learn the cost model, in a principled manner, using regularization to avoid overfitting. To address the latter, we devise strategies to prune the state space, both in the general case and for the use case of index tuning. We empirically and comparatively evaluate our approach on a standard OLTP dataset. We show that our approach is competitive with state-of-the-art adaptive index tuning, which is dependent on a cost model.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

  1. 1.

    By convergence we mean the first stable patch in Fig. 1 after the series of high spikes, around the 500\(^{th}\) query. The convergence point is qualitatively chosen by observing characteristics of the curve.

References

  1. Agrawal, S., Chaudhuri, S., Narasayya, V.R.: Automated selection of materialized views and indexes in sql databases. In: Proceedings of the 26th International Conference on Very Large Data Bases (VLDB 2000), pp. 496–505 (2000)

    Google Scholar 

  2. Agrawal, S., Narasayya, V., Yang, B.: Integrating vertical and horizontal partitioning into automated physical database design. In: Proceedings of the 2004 ACM SIGMOD International Conference on Management of Data (SIGMOD 2004), pp. 359–370 (2004)

    Google Scholar 

  3. Alagiannis, I., Idreos, S., Ailamaki, A.: H2o: a hands-free adaptive store. In: Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data (SIGMOD 2014) (2014)

    Google Scholar 

  4. Audibert, J.Y., Munos, R., Szepesvári, C.: Exploration-exploitation tradeoff using variance estimates in multi-armed bandits. Theoret. Comput. Sci. 410(19), 1876–1902 (2009)

    Article  MathSciNet  MATH  Google Scholar 

  5. Azefack, S., Aouiche, K., Darmont, J.: Dynamic index selection in data warehouses. CoRR abs/0809.1965 (2008). http://arXiv.org/abs/0809.1965

  6. Basu, D., Lin, Q., Chen, W., Vo, H.T., Yuan, Z., Senellart, P., Bressan, S.: Cost-model oblivious database tuning with reinforcement learning. In: Chen, Q., Hameurlain, A., Toumani, F., Wagner, R., Decker, H. (eds.) DEXA 2015. LNCS, vol. 9262, pp. 253–268. Springer, Heidelberg (2015). doi:10.1007/978-3-319-22849-5_18

    Chapter  Google Scholar 

  7. Benedikt, M., Bohannon, P., Bruns, G.: Data cleaning for decision support. In: Proceedings of the 1st International VLDB Workshop on Clean Databases (CleanDB 2006) (2006)

    Google Scholar 

  8. Borodin, A., El-Yaniv, R.: Online Computation and Competitive Analysis. Cambridge University Press, Cambridge (1998)

    MATH  Google Scholar 

  9. Bouchakri, R., Bellatreche, L., Hidouci, K.-W.: Static and incremental selection of multi-table indexes for very large join queries. In: Morzy, T., Valduriez, P., Bellatreche, L. (eds.) ADBIS 2015. LNCS, vol. 9282, pp. 43–56. Springer, Heidelberg (2012). doi:10.1007/978-3-642-33074-2_4

    Google Scholar 

  10. Bruno, N., Chaudhuri, S.: An online approach to physical design tuning. In: Proceedings of the 23th IEEE International Conference on Data Engineering (ICDE 2007), pp. 826–835 (2007)

    Google Scholar 

  11. Bruno, N., Chaudhuri, S.: Constrained physical design tuning. Proc. VLDB Endow. 1(1), 4–15 (2008)

    Article  Google Scholar 

  12. Bruno, N., Chaudhuri, S.: Interactive physical design tuning. In: Proceedings of the 26th IEEE International Conference on Data Engineering (ICDE 2010), pp. 1161–1164 (2010)

    Google Scholar 

  13. Bruno, N., Nehme, R.V.: Configuration-parametric query optimization for physical design tuning. In: Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data (SIGMOD 2008), pp. 941–952 (2008)

    Google Scholar 

  14. Chaudhuri, S., Narasayya, V.: Autoadmin: what-if index analysis utility. In: Proceedings of the 1998 ACM SIGMOD International Conference on Management of Data (SIGMOD 1998), pp. 367–378 (1998)

    Google Scholar 

  15. Difallah, D.E., Pavlo, A., Curino, C., Cudre-Mauroux, P.: Oltp-bench: an extensible testbed for benchmarking relational databases. Proc. VLDB Endow. 7(4), 277–288 (2013)

    Article  Google Scholar 

  16. Gouriten, G., Maniu, S., Senellart, P.: Scalable, generic, and adaptive systems for focused crawling. In: Proceedings of the 25th ACM Conference on Hypertext and Social Media (HT 2014), pp. 35–45 (2014)

    Google Scholar 

  17. Hammer, M., Niamir, B.: A heuristic approach to attribute partitioning. In: Proceedings of the 1979 ACM SIGMOD International Conference on Management of Data (SIGMOD 1979), pp. 93–101 (1979)

    Google Scholar 

  18. Lagoudakis, M.G., Parr, R.: Least-squares policy iteration. J. Mach. Learn. Res. 4, 1107–1149 (2003)

    MathSciNet  MATH  Google Scholar 

  19. Lai, T.L., Wei, C.Z.: Least squares estimates in stochastic regression models with applications to identification and control of dynamic systems. Ann. Stat. 154–166 (1982)

    Google Scholar 

  20. LeFevre, F., Sankaranarayanan, J., Hacigumus, H., Tatemura, J., Polyzotis, N., Carey, M.J.: Exploiting opportunistic physical design in large-scale data analytics. In: Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data (SIGMOD 2014) (2014)

    Google Scholar 

  21. Li, L., Gruenwald, L.: Self-managing online partitioner for databases (smopd): a vertical database partitioning system with a fully automatic online approach. In: Proceedings of the 17th International Database Engineering and Applications Symposium (IDEAS 2013), pp. 168–173 (2013)

    Google Scholar 

  22. Lightstone, S., Bhattacharjee, B.: Automated design of multidimensional clustering tables for relational databases. In: Proceedings of the 30th International Conference on Very Large Data Bases (VLDB 2004), pp. 1170–1181 (2004)

    Google Scholar 

  23. Lohman, G.M.: Is query optimization a “solved” problem? (2014). http://wp.sigmod.org/?p=1075

  24. Luhring, M., Sattler, K.U., Schmidt, K., Schallehn, E.: Autonomous management of soft indexes. In: Proceedings of the 2nd International Workshop on Self-Managing Data Bases (SMDB 2007), pp. 450–458 (2007)

    Google Scholar 

  25. Malik, T., Wang, X., Dash, D., Chaudhary, A., Ailamaki, A., Burns, R.: Adaptive physical design for curated archives. In: Ailamaki, A., Bowers, S. (eds.) SSDBM 2012. LNCS, vol. 7338, pp. 148–166. Springer, Heidelberg (2009). doi:10.1007/978-3-642-02279-1_11

    Chapter  Google Scholar 

  26. Nielsen, F., Bhatia, R.: Matrix Information Geometry. Springer, Heidelberg (2013)

    Book  MATH  Google Scholar 

  27. Papadomanolakis, S., Dash, D., Ailamaki, A.: Efficient use of the query optimizer for automated physical design. In: Proceedings of the 33rd International Conference on Very Large Data Bases (VLDB 2007), pp. 1093–1104 (2007)

    Google Scholar 

  28. Powell, W.B.: Approximate Dynamic Programming: Solving the Curses of Dimensionality. Wiley-Interscience, Hoboken (2007)

    Book  MATH  Google Scholar 

  29. Puterman, M.L.: Markov Decision Processes Discrete Stochastic Dynamic Programming, vol. 414. Wiley, Hoboken (2009)

    MATH  Google Scholar 

  30. Raab, F.: TPC-C - the standard benchmark for online transaction processing (OLTP). In: Gray, J. (ed.) The Benchmark Handbook. Morgan Kaufmann, Burlington (1993)

    Google Scholar 

  31. Ramakrishnan, R., Gehrke, J., Gehrke, J.: Database Management Systems, vol. 3. McGraw-Hill, New York (2003)

    MATH  Google Scholar 

  32. Rao, J., Zhang, C., Megiddo, N., Lohman, G.: Automating physical database design in a parallel database. In: Proceedings of the 2002 ACM SIGMOD International Conference on Management of Data (SIGMOD 2002), pp. 558–569 (2002)

    Google Scholar 

  33. Rasin, A., Zdonik, S.: An automatic physical design tool for clustered column-stores. In: Proceedings of the 16th International Conference on Extending Database Technology (EDBT 2013), pp. 203–214 (2013)

    Google Scholar 

  34. Rieser, V., Robinson, D.T., Murray-Rust, D., Rounsevell, M.: A comparison of genetic algorithms and reinforcement learning for optimising sustainable forest management. GeoComputation (2011)

    Google Scholar 

  35. Rockafellar, R.T.: Convex Analysis. Princeton University Press, Princeton (2015)

    MATH  Google Scholar 

  36. Rösch, P., Dannecker, L., Färber, F., Hackenbroich, G.: A storage advisor for hybrid-store databases. Proc. VLDB Endow. 5(12), 1748–1758 (2012)

    Article  Google Scholar 

  37. Schnaitter, K., Polyzotis, N.: A benchmark for online index selection. In: 2009 IEEE 25th International Conference on Data Engineering, pp. 1701–1708, March 2009

    Google Scholar 

  38. Schnaitter, K., Abiteboul, S., Milo, T., Polyzotis, N.: On-line index selection for shifting workloads. In: Proceedings of the 2nd International Workshop on Self-Managing Data Bases (SMDB 2007), pp. 459–468 (2007)

    Google Scholar 

  39. Schnaitter, K., Polyzotis, N.: Semi-automatic index tuning: keeping dbas in the loop. Proc. VLDB Endow. 5(5), 478–489 (2012)

    Article  Google Scholar 

  40. Stillger, M., Lohman, G.M., Markl, V., Kandil, M.: LEO - DB2’s LEarning Optimizer. In: VLDB (2001)

    Google Scholar 

  41. Sutton, R.S., Barto, A.G.: Reinforcement Learning. MIT Press, Cambridge (1998)

    Google Scholar 

  42. Warmuth, M.K., Jagota, A.K.: Continuous and discrete-time nonlinear gradient descent: relative loss bounds and convergence. In: Electronic proceedings of the 5th International Symposium on Artificial Intelligence and Mathematics. Citeseer (1997)

    Google Scholar 

  43. White, D.J.: Markov Decision Processes. Wiley, New York (1993)

    MATH  Google Scholar 

  44. Young, P.: Recursive least squares estimation. In: Recursive Estimation and Time-Series Analysis, pp. 29–46. Springer, Berlin, Heidelberg (2011)

    Google Scholar 

  45. Zilio, D.C., Zuzarte, C., Lightstone, S., Ma, W., Lohman, G.M., Cochrane, R., Pirahesh, H., Colby, L.S., Gryz, J., Alton, E., Liang, D., Valentin, G.: Recommending materialized views and indexes with IBM DB2 design advisor. In: Proceedings of the 1st International Conference on Autonomic Computing (ICAC 2004), pp. 180–188 (2004)

    Google Scholar 

Download references

Acknowledgement

We thank Prof. Haibo Chen for valuable feedback on this work. This research is funded by the National Research Foundation Singapore under its Campus for Research Excellence and Technological Enterprise (CREATE) programme with the SP2 project of the Energy and Environmental Sustainability Solutions for Megacities – E2S2 programme.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Pierre Senellart .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2016 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Basu, D. et al. (2016). Regularized Cost-Model Oblivious Database Tuning with Reinforcement Learning. In: Hameurlain, A., Küng, J., Wagner, R., Chen, Q. (eds) Transactions on Large-Scale Data- and Knowledge-Centered Systems XXVIII. Lecture Notes in Computer Science(), vol 9940. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-53455-7_5

Download citation

  • DOI: https://doi.org/10.1007/978-3-662-53455-7_5

  • Published:

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-53454-0

  • Online ISBN: 978-3-662-53455-7

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics