Skip to main content
Log in

SmartIX: A database indexing agent based on reinforcement learning

  • Published:
Applied Intelligence Aims and scope Submit manuscript

Abstract

Configuring databases for efficient querying is a complex task, often carried out by a database administrator. Solving the problem of building indexes that truly optimize database access requires a substantial amount of database and domain knowledge, the lack of which often results in wasted space and memory for irrelevant indexes, possibly jeopardizing database performance for querying and certainly degrading performance for updating. In this paper, we develop the SmartIX architecture to solve the problem of automatically indexing a database by using reinforcement learning to optimize queries by indexing data throughout the lifetime of a database. We train and evaluate SmartIX performance using TPC-H, a standard, and scalable database benchmark. Our empirical evaluation shows that SmartIX converges to indexing configurations with superior performance compared to standard baselines we define and other reinforcement learning methods used in related work.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4

Similar content being viewed by others

Notes

  1. TPC: http://www.tpc.org/

References

  1. Amado L, Meneguzzi F (2017) Reinforcement learning applied to RTS games. In: Proceedings of the 2017 workshop on adaptive learning agents (ALA), pp 1–8. ALA 2017, São Paulo. Brazil

  2. Basu D, Lin Q, Chen W, Vo H T, Yuan Z, Senellart P, Bressan S (2016) Regularized cost-model oblivious database tuning with reinforcement learning. In: Transactions on large-scale data-and knowledge-centered systems XXVIII. Springer, Berlin, pp 96–132

  3. Bellman R (1978) An introduction to artificial intelligence: Can computers think? Boyd & Fraser Pub. Co The University of Michigan

  4. DB Group at UCSC: DBTune (2019). https://github.com/dbgroup-at-ucsc/dbtune

  5. Duan S, Thummala V, Babu S (2009) Tuning database configuration parameters with ituned. Proceedings of the VLDB Endowment 2(1):1246–1257

    Article  Google Scholar 

  6. Elfayoumy S, Patel J (2012) Database performance monitoring and tuning using intelligent agent assistants. In: Proceedings of the International Conference on Information and Knowledge Engineering (IKE), pp 1. The steering committee of the world congress in computer science, Computer …

  7. EnterpriseDB: Enterprise Database (2019). https://www.enterprisedb.com

  8. Giardino C, Paternoster N, Unterkalmsteiner M, Gorschek T, Abrahamsson P (2016) Software development in startup companies: The greenfield startup model. IEEE Trans Softw Eng 42(6):585–604

    Article  Google Scholar 

  9. Kraska T, Beutel A, Chi E H, Dean J, Polyzotis N (2018) The case for learned index structures. In: Proceedings of the 2018 international conference on management of data, pp 489–504. ACM

  10. Licks GP, Couto J, Meneguzzi F, Ruiz D, de Fátima Miehe P, de Paris R (2019) mir-pucrs/smartix-rl v1.0. https://doi.org/10.5281/zenodo.3254967

  11. Marcus R, Papaemmanouil O (2018) Deep reinforcement learning for join order enumeration. arXiv:1803.00055, 1–7

  12. Mnih V, Kavukcuoglu K, Silver D, Graves A, Antonoglou I, Wierstra D, Riedmiller MA (2013) Playing atari with deep reinforcement learning. arXiv:1312.5602, 9

  13. Neuhaus P, Couto J, Wehrmann J, Ruiz D, Meneguzzi F (2019) GADIS: A genetic algorithm for database index selection. In: The 31st international conference on software engineering and knowledge engineering (SEKE), pp 39–42. https://doi.org/10.18293/SEKE2019-135

  14. Olofson C W (2018) Ensuring a fast, reliable, and secure database through automation: Oracle autonomous database. White paper, IDC Corporate USA, Sponsored by: Oracle Corp

  15. Ortiz J, Balazinska M, Gehrke J, Keerthi SS (2018) Learning state representations for query optimization with deep reinforcement learning. arXiv:1803.08604, 1–5

  16. Pavlo A, Angulo G, Arulraj J, Lin H, Lin J, Ma L, Menon P, Mowry T C, Perron M, Quah I et al (2017) Self-driving database management systems. In: Conference on innovative data systems research (CIDR). CIDR, Chaminade, California, pp 1–6

  17. Pedrozo W G, Nievola J C, Ribeiro D C (2018) An adaptive approach for index tuning with learning classifier systems on hybrid storage environments. In: International conference on hybrid artificial intelligence systems, pp 716–729. Springer

  18. Petraki E, Idreos S, Manegold S (2015) Holistic indexing in main-memory column-stores. In: Proceedings of the 2015 ACM SIGMOD international conference on management of data, SIGMOD ’15. https://doi.org/10.1145/2723372.2723719. ACM, New York, pp 1153–1166

  19. Popovic J (2017) Automatic tuning - SQL Server. https://docs.microsoft.com/en-us/sql/relational-databases/automatic-tuning/automatic-tuning. Accessed: 2019-06-17

  20. POWA (2019) PostreSQL workload analyzer. http://powa.readthedocs.io

  21. Ramakrishnan R, Gehrke J (2003) Database management systems, 3rd edn. McGraw-Hill, Inc., New York

    MATH  Google Scholar 

  22. Sharma A, Schuhknecht FM, Dittrich J (2018) The case for automatic database administration using deep reinforcement learning. arXiv:1801.05643, 1–9

  23. Sutton R S, Barto A G (2018) Reinforcement learning: An introduction. MIT Press, Cambridge

    MATH  Google Scholar 

  24. Sutton R S, Barto A G (2018) Reinforcement learning: An introduction, chap. Introduction. MIT Press, Cambridge, pp 1–13

    MATH  Google Scholar 

  25. Sutton R S, Barto A G (2018) Reinforcement learning: An introduction, chap. Applications and case studies. MIT Press, Cambridge, pp 421–453

    Google Scholar 

  26. Sutton R S, Barto A G (2018) Reinforcement learning: An introduction, chap. Temporal-difference learning. MIT Press, Cambridge, pp 119–138

    Google Scholar 

  27. Thanopoulou A, Carreira P, Galhardas H (2012) Benchmarking with TPC-H on off-the-shelf hardware. In: 14th international conference on enterprise information systems. Springer, Wroclaw, pp 205–208

  28. TPC: Transaction performance council website (TPC) (1998) http://www.tpc.org/

  29. Tsitsiklis J N, Roy B V (1996) An analysis of temporal-difference learning with function approximation. Tech. rep., Report LIDS-P-2322). Laboratory for information and decision systems, Massachusetts Institute of Technology

  30. Van Aken D, Pavlo A, Gordon G J, Zhang B (2017) Automatic database management system tuning through large-scale machine learning. In: Proceedings of the 2017 ACM international conference on management of data, pp 1009–1024. ACM

  31. Wang J, Liu W, Kumar S, Chang S F (2016) Learning to hash for indexing big data—a survey. Proc IEEE 104(1):34–57

    Article  Google Scholar 

Download references

Acknowledgements

This work was supported by SAP SE. We thank our colleagues from SAP Labs Latin America who provided insights and expertise that greatly assisted the research.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Gabriel Paludo Licks.

Ethics declarations

Conflict of interests

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Appendix

1.1 A Index configurations

The corresponding indexed columns in each of the configurations we use in our experiments are shown in Table 3.

Table 3 Indexing configuration for all the baselines and SmartIX

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Paludo Licks, G., Colleoni Couto, J., de Fátima Miehe, P. et al. SmartIX: A database indexing agent based on reinforcement learning. Appl Intell 50, 2575–2588 (2020). https://doi.org/10.1007/s10489-020-01674-8

Download citation

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10489-020-01674-8

Keywords

Navigation