VEER: enhancing the interpretability of model-based optimizations

Published in: Empirical Software Engineering

Abstract

Context:

Many software systems can be tuned for multiple objectives (e.g., faster runtime, lower memory use, less network traffic, or lower energy consumption). Such systems can suffer from “disagreement”, where different models offer different (or even opposite) insights and tactics on how to optimize a system. For configuration problems, we show that (a) model disagreement is rampant; yet (b) prior to this paper, it had barely been explored.

Objective:

We aim to help practitioners and researchers better solve multi-objective configuration optimization problems by resolving model disagreement.

Method:

We propose a dimension-reduction method called VEER that builds a useful one-dimensional approximation to the original N-objective space. Traditional model-based optimizers use Pareto search to locate Pareto-optimal solutions to a multi-objective problem, which is computationally expensive for large-scale systems. VEER builds a surrogate that can replace the Pareto sorting step after deployment (see the sketch below).
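To make the method concrete, below is a minimal sketch of the idea (our illustration, not the authors' implementation; the helper names and the choice of scikit-learn's DecisionTreeRegressor as the surrogate are assumptions): collapse the N objective values into a single scalar score, here a non-domination rank, then train a regressor that predicts that score directly from a configuration's features.

```python
# Minimal sketch of a one-dimensional surrogate replacing Pareto sorting.
# Illustration only: helper names and the regression-tree surrogate are
# our assumptions, not the authors' implementation.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def domination_rank(objectives):
    """For each configuration, count how many others dominate it
    (all objectives assumed to be minimized; lower rank = better)."""
    n = len(objectives)
    ranks = np.zeros(n)
    for i in range(n):
        for j in range(n):
            if (np.all(objectives[j] <= objectives[i])
                    and np.any(objectives[j] < objectives[i])):
                ranks[i] += 1
    return ranks

def fit_surrogate(configs, objectives):
    """Collapse N objectives into one score, then learn configs -> score."""
    return DecisionTreeRegressor().fit(configs, domination_rank(objectives))

# After deployment, comparing two configurations costs two cheap
# predictions instead of a Pareto sort over the whole population:
#   surrogate = fit_surrogate(configs, objectives)
#   a_is_better = surrogate.predict([a])[0] < surrogate.predict([b])[0]
```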

Results:

Compared to the prior state of the art, across 11 configurable systems, VEER significantly reduces disagreement and execution time without compromising optimization performance in most cases. For our largest problem (with tens of thousands of possible configurations), optimizing with VEER finds optimizations as good as or better, with zero model disagreements, three orders of magnitude faster.

Conclusion:

When employing model-based optimizers for multi-objective optimization, we recommend applying VEER: it not only improves execution time but also resolves the potential model-disagreement problem.

Notes

  1. http://tiny.cc/gartners21

  2. Holland’s advice (Holland 1992) for genetic algorithms (such as NSGA-II and MOEA/D) is that 100 individuals need to be evolved over 100 generations; i.e., 10^4 evaluations in all.

  3. The source code for that implementation can be found in the reproduction package mentioned in our abstract.

  4. We define a concordant pair for tasks with more than two objectives in a similar manner: one configuration performs better than the other in all objectives (see the sketch below). Kendall did not originally define this, but we believe it is a proper extension.
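As a concrete illustration of this extended definition, here is a minimal sketch (a hypothetical helper, assuming all objectives are to be minimized):

```python
def concordant(a, b):
    """Extended concordant pair for more than two objectives: one
    configuration is strictly better than the other in every objective.
    Illustrative sketch; assumes all objectives are minimized."""
    return (all(x < y for x, y in zip(a, b))
            or all(x > y for x, y in zip(a, b)))

# Example: (1, 2, 3) vs. (2, 3, 4) is concordant;
#          (1, 5, 3) vs. (2, 3, 4) is not.
```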

References

  • Agrawal A, Menzies T, Minku LL, Wagner M, Yu Z (2020) Better software analytics via “DUO”: data mining algorithms using/used-by optimizers. Empir Softw Eng 25(3):2099–2136

  • Antkiewicz M, Bąk K, Murashkin A, Olaechea R, Liang JH, Czarnecki K (2013) Clafer tools for product line engineering. In: Proceedings of the 17th international software product line conference co-located workshops, pp 130–135

  • Bergstra J, Bardenet R, Bengio Y, Kégl B (2011) Algorithms for hyper-parameter optimization. Adv Neural Inf Process Syst, vol 24

  • Bergstra J, Yamins D, Cox DD et al (2013) Hyperopt: a python library for optimizing the hyperparameters of machine learning algorithms. In: Proceedings of the 12th Python in science conference, Citeseer, vol 13, p 20

  • Brochu E, Cora VM, De Freitas N (2010) A tutorial on bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning. arXiv:1012.2599

  • Chen D, Fu W, Krishna R, Menzies T (2018a) Applications of psychological science for actionable analytics. In: Proceedings of the 2018 26th ACM joint meeting on european software engineering conference and symposium on the foundations of software engineering, pp 456–467

  • Chen J, Nair V, Krishna R, Menzies T (2018b) “Sampling” as a baseline optimizer for search-based software engineering. IEEE Trans Softw Eng (pre-print):1–1

  • Coello CAC, Sierra MR (2004) A study of the parallelization of a coevolutionary multi-objective evolutionary algorithm. In: Mexican international conference on artificial intelligence. Springer, pp 688–697

  • Deb K, Pratap A, Agarwal S, Meyarivan T (2002) A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans Evol Comput 6(2):182–197

  • Devanbu P, Zimmermann T, Bird C (2016) Belief & evidence in empirical software engineering. In: 2016 IEEE/ACM 38th international conference on software engineering (ICSE). IEEE, pp 108–119

  • Gigerenzer G (2008) Why heuristics work. Perspect Psychol Sci 3(1):20–29

  • Golovin D, Solnik B, Moitra S, Kochanski G, Karro J, Sculley D (2017) Google vizier: a service for black-box optimization. In: Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining, pp 1487–1495

  • Guo J, Yang D, Siegmund N, Apel S, Sarkar A, Valov P, Czarnecki K, Wasowski A, Yu H (2018) Data-efficient performance learning for configurable systems. Empir Softw Eng 23(3):1826–1867

  • Herodotou H, Lim H, Luo G, Borisov N, Dong L, Cetin FB, Babu S (2011) Starfish: a self-tuning system for big data analytics. In: Conference on innovative data systems research

  • Hess MR, Kromrey JD (2004) Robust confidence intervals for effect sizes: a comparative study of Cohen’s d and Cliff’s delta under non-normality and heterogeneous variances. In: Annual meeting of the American educational research association, pp 1–30

  • Holland JH (1992) Genetic algorithms. Sci Amer 267(1):66–73

  • Huband S, Hingston P, While L, Barone L (2003) An evolution strategy with probabilistic mutation for multi-objective optimisation. In: The 2003 congress on evolutionary computation (CEC’03), IEEE, vol 4, pp 2284–2291

  • Hutter F, Hoos HH, Leyton-Brown K (2011) Sequential model-based optimization for general algorithm configuration. In: 5th LION

  • Jiarpakdee J, Tantithamthavorn C, Grundy J (2021) Practitioners’ perceptions of the goals and visual explanations of defect prediction models. arXiv:2102.12007

  • Kaltenecker C, Grebhahn A, Siegmund N, Apel S (2020) The interplay of sampling and machine learning for software performance prediction. IEEE Softw 37(4):58–66

  • Kendall MG (1948) Rank correlation methods

  • Kolesnikov S, Siegmund N, Kästner C, Grebhahn A, Apel S (2019) Tradeoffs in modeling performance of highly configurable software systems. Softw Syst Model 18(3):2265–2283

  • Laumanns M, Thiele L, Deb K, Zitzler E (2002) Combining convergence and diversity in evolutionary multiobjective optimization. Evol Comput

  • Macbeth G, Razumiejczyk E, Ledesma RD (2011) Cliff’s delta calculator: a non-parametric effect size program for two groups of observations. Univ Psychol 10(2):545–555

  • Mittas N, Angelis L (2012) Ranking and clustering software cost estimation models through a multiple comparisons algorithm. IEEE Trans Softw Eng 39(4):537–551

  • Nair V, Menzies T, Siegmund N, Apel S (2017) Using bad learners to find good configurations. In: Proceedings of the 2017 11th joint meeting on foundations of software engineering, pp 257–267

  • Nair V, Yu Z, Menzies T, Siegmund N, Apel S (2018) Finding faster configurations using FLASH. IEEE Trans Softw Eng 46(7):794–811

  • Sarkar A, Guo J, Siegmund N, Apel S, Czarnecki K (2015) Cost-efficient sampling for performance prediction of configurable systems (t). In: 2015 30th IEEE/ACM international conference on automated software engineering (ASE). IEEE, pp 342–352

  • Sawyer R (2011) BI’s impact on analyses and decision making depends on the development of less complex applications. IJBIR 2:52–63. https://doi.org/10.4018/IJBIR.2011070104

  • Shrikanth N, Menzies T (2020) Assessing practitioner beliefs about software defect prediction. In: 2020 IEEE/ACM 42nd international conference on software engineering: software engineering in practice (ICSE-SEIP). IEEE, pp 182–190

  • Siegmund N, Grebhahn A, Apel S, Kästner C (2015) Performance-influence models for highly configurable systems. In: Proceedings of the joint meeting on foundations of software engineering (ESEC/FSE), ACM, pp 284–294

  • Snoek J, Larochelle H, Adams R (2012) Practical bayesian optimization of machine learning algorithms. In: NIPS - volume 2

  • Song W, Chan FT (2015) Multi-objective configuration optimization for product-extension service. J Manuf Syst 37:113–125

  • Tan SY, Chan T (2016) Defining and conceptualizing actionable insight: a conceptual framework for decision-centric analytics. arXiv:1606.03510

  • Tu H, Papadimitriou G, Kiran M, Wang C, Mandal A, Deelman E, Menzies T (2021) Mining workflows for anomalous data transfers. In: 2021 IEEE/ACM 18th international conference on mining software repositories (MSR), pp 1–12. https://doi.org/10.1109/MSR52588.2021.00013

  • Van Aken D, Pavlo A, Gordon GJ, Zhang B (2017) Automatic database management system tuning through large-scale machine learning. In: International conference on management of data, ACM

  • Van Veldhuizen DA (1999) Multiobjective evolutionary algorithms: classifications, analyses and new innovations. Tech rep, Air Force Institute of Technology, Wright-Patterson AFB, OH, School of Engineering

  • Xia T, Krishna R, Chen J, Mathew G, Shen X, Menzies T (2018) Hyperparameter optimization for effort estimation. arXiv:1805.00336

  • Xu T, Jin L, Fan X, Zhou Y, Pasupathy S, Talwadker R (2015) Hey, you have given me too many knobs!: understanding and dealing with over-designed configuration in system software. In: Foundations of software engineering

  • Zhang Q, Li H (2007) MOEA/D: a multiobjective evolutionary algorithm based on decomposition. IEEE Trans Evol Comput 11(6):712–731

  • Zhu H, Jin J, Tan C, Pan F, Zeng Y, Li H, Gai K (2017) Optimized cost per click in taobao display advertising. arXiv preprint

  • Zitzler E, Laumanns M, Thiele L (2001) SPEA2: improving the strength Pareto evolutionary algorithm. TIK-Report, vol 103

  • Zuluaga M, Krause A, Püschel M (2016) ε-PAL: an active learning approach to the multi-objective optimization problem. J Mach Learn Res 17(1):3619–3650

Acknowledgements

This work was partially funded by a research grant from the Laboratory for Analytical Sciences, North Carolina State University. Apel’s work has been funded by the German Research Foundation (AP 206/11 and Grant 389792660 as part of TRR 248 – CPEC). Siegmund’s work has been supported by the Federal Ministry of Education and Research of Germany and by the Sächsisches Staatsministerium für Wissenschaft, Kultur und Tourismus within the program Center of Excellence for AI Research “Center for Scalable Data Analytics and Artificial Intelligence Dresden/Leipzig” (project identification number: ScaDS.AI), and by the German Research Foundation (SI 2171/2-2).

Funding

Apart from the funding acknowledged above, the authors declare no other conflicts of interest.

Author information

Corresponding author

Correspondence to Kewen Peng.

Ethics declarations

Conflict of Interest

We assert that the authors have no conflict of interest.

Additional information

Communicated by: Erik Linstead

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Peng, K., Kaltenecker, C., Siegmund, N. et al. VEER: enhancing the interpretability of model-based optimizations. Empir Software Eng 28, 61 (2023). https://doi.org/10.1007/s10664-023-10296-w
