Mjolnir: A framework agnostic auto-tuning system with deep reinforcement learning

Ben Slimane, Nourchene; Sagaama, Houssem; Marwani, Maher; Skhiri, Sabri

doi:10.1007/s10489-022-03956-9

Published: 20 October 2022

Applied Intelligence, Volume 53, pages 14008–14022 (2023)

Nourchene Ben Slimane (ORCID: orcid.org/0000-0003-0434-310X)¹, Houssem Sagaama¹, Maher Marwani¹ & Sabri Skhiri²


Abstract

Choosing the right settings for big data frameworks is an important yet difficult task. These frameworks expose a complex set of parameters that must be tuned to achieve the best performance in terms of throughput and latency. Learning-based auto-tuning methods built on traditional machine learning models might not be effective for this task because they require huge amounts of high-quality training data, which are time-consuming and very expensive to collect. A good alternative is to use reinforcement learning to train an intelligent agent through trial and error. In this context, we propose a framework-agnostic auto-tuning system that implements an actor-critic algorithm, namely TD3 (Twin Delayed Deep Deterministic Policy Gradient). We show that the agent can find an optimal configuration in a continuous, high-dimensional search space within a limited number of steps. We conducted extensive experiments on Apache Spark under different workloads from the HiBench, TPC-DS and TPC-H benchmarking tools. In this paper, we give a detailed representation of the reinforcement learning environment and identify the best design through experiments. Results show that our approach outperforms state-of-the-art tuning methods and can improve the performance of Spark workloads over the default configurations by up to \(\sim 77\%\), with an average of \(\sim 45\%\). It also shows promising adaptation behaviour under workload variation during evaluation.
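To make the abstract's description more concrete, the following is a minimal sketch of two ingredients a TD3-based tuner of this kind would plausibly need: mapping a continuous action vector onto framework parameters, and computing the TD3 critic target (target policy smoothing plus the clipped double-Q minimum, as described by Fujimoto et al.). It is not the authors' implementation. The parameter set, the bounds in `PARAM_BOUNDS`, and all function names are hypothetical; the abstract does not state which Spark parameters or ranges were tuned. The defaults for `sigma`, `noise_clip` and `gamma` follow the original TD3 paper, not necessarily Mjolnir.

```python
import random

# Hypothetical tuning ranges for three real Spark configuration keys.
# The parameters and bounds actually used in the paper are not given here.
PARAM_BOUNDS = {
    "spark.executor.instances": (2, 20),
    "spark.executor.cores": (1, 8),
    "spark.sql.shuffle.partitions": (50, 1000),
}


def scale_action(action, bounds=PARAM_BOUNDS):
    """Map each action component in [-1, 1] to an integer in its range."""
    config = {}
    for a, (name, (low, high)) in zip(action, bounds.items()):
        a = max(-1.0, min(1.0, a))  # clip out-of-range actor outputs
        config[name] = round(low + (a + 1.0) / 2.0 * (high - low))
    return config


def smooth_target_action(action, sigma=0.2, noise_clip=0.5, rng=random):
    """Target policy smoothing: add clipped Gaussian noise, clip to [-1, 1]."""
    smoothed = []
    for a in action:
        noise = max(-noise_clip, min(noise_clip, rng.gauss(0.0, sigma)))
        smoothed.append(max(-1.0, min(1.0, a + noise)))
    return smoothed


def td3_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Clipped double-Q target: bootstrap from the smaller critic estimate."""
    not_done = 0.0 if done else 1.0
    return reward + gamma * not_done * min(q1_next, q2_next)


if __name__ == "__main__":
    # An actor output of [-1, 1, 0] selects the low, high and midpoint values.
    print(scale_action([-1.0, 1.0, 0.0]))
    # With gamma=0.5 the target is 1.0 + 0.5 * min(2.0, 3.0) = 2.0.
    print(td3_target(reward=1.0, done=False, q1_next=2.0, q2_next=3.0, gamma=0.5))
```

In a full system the reward would be derived from measured workload performance (for example, the runtime relative to the default configuration), and `q1_next` and `q2_next` would come from the two target critic networks evaluated at the smoothed target action. Those parts are omitted because the abstract does not specify them.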



Funding

This work is supported and funded by the Walloon Region, Belgium.

Author information

Authors and Affiliations

R&D Department, EURA NOVA, Tunis, Tunisia
Nourchene Ben Slimane, Houssem Sagaama & Maher Marwani
R&D Department, EURA NOVA, Mont-Saint-Guibert, Belgium
Sabri Skhiri


Corresponding author

Correspondence to Nourchene Ben Slimane.

Ethics declarations

Conflict of Interests/Competing Interests

The authors declare that there is no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ben Slimane, N., Sagaama, H., Marwani, M. et al. Mjolnir: A framework agnostic auto-tuning system with deep reinforcement learning. Appl Intell 53, 14008–14022 (2023). https://doi.org/10.1007/s10489-022-03956-9

Accepted: 02 July 2022
Published: 20 October 2022
Issue Date: June 2023
DOI: https://doi.org/10.1007/s10489-022-03956-9
