Empirical Software Engineering

Volume 18, Issue 3, pp 594–623

Parameter tuning or default values? An empirical investigation in search-based software engineering

  • Andrea Arcuri
  • Gordon Fraser


Abstract

Many software engineering problems have been addressed with search algorithms. Search algorithms usually depend on several parameters (e.g., population size and crossover rate in genetic algorithms), and the choice of these parameters can have an impact on the performance of the algorithm. The No Free Lunch theorem formally proves that it is impossible to tune a search algorithm so that it has optimal settings for all possible problems. How, then, should the parameters of a search algorithm be set for a given software engineering problem? In this paper, we carry out the largest empirical analysis to date on parameter tuning in search-based software engineering. More than one million experiments were carried out and statistically analyzed in the context of test data generation for object-oriented software using the EvoSuite tool. Results show that tuning does indeed have an impact on the performance of a search algorithm. However, at least in the context of test data generation, it does not seem easy to find settings that significantly outperform the "default" values suggested in the literature. This has practical value for both researchers (e.g., when different techniques are compared) and practitioners: using "default" values is a reasonable and justified choice, whereas parameter tuning is a long and expensive process that may or may not pay off in the end.
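The genetic-algorithm parameters the abstract mentions (population size, crossover rate) can be made concrete with a toy example. The sketch below is a minimal generational genetic algorithm for the classic OneMax problem, written in Python; it is not EvoSuite's algorithm, and all parameter defaults here are illustrative assumptions rather than the values studied in the paper. Running it with different `population_size` and `crossover_rate` settings under the same search budget shows how parameter choices change the fitness a search achieves, which is the kind of effect that tuning experiments measure.

```python
import random


def one_max(bits):
    # Fitness of a bit string: the number of 1-bits (to be maximized).
    return sum(bits)


def genetic_algorithm(n_bits=20, population_size=50, crossover_rate=0.75,
                      mutation_rate=0.05, generations=100, seed=0):
    """Minimal generational GA for OneMax.

    population_size and crossover_rate are examples of the tunable
    parameters discussed in the paper; the default values here are
    illustrative only. Returns the best fitness in the final population.
    """
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(population_size)]

    for _ in range(generations):
        def select():
            # Binary tournament selection: pick two, keep the fitter.
            a, b = rng.sample(population, 2)
            return a if one_max(a) >= one_max(b) else b

        offspring = []
        while len(offspring) < population_size:
            p1, p2 = select(), select()
            if rng.random() < crossover_rate:
                # Single-point crossover.
                point = rng.randrange(1, n_bits)
                child = p1[:point] + p2[point:]
            else:
                child = p1[:]
            # Bit-flip mutation on each position independently.
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]
            offspring.append(child)
        population = offspring

    return max(one_max(ind) for ind in population)


if __name__ == "__main__":
    # Same budget, two parameter settings: the achieved fitness can differ,
    # which is exactly what parameter tuning studies quantify.
    print(genetic_algorithm(population_size=50, crossover_rate=0.75))
    print(genetic_algorithm(population_size=4, crossover_rate=0.0))
```

Comparing such runs over many seeds (and with proper statistical tests, as the paper does) is what separates a genuine tuning effect from random noise.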


Keywords: Search-based software engineering · Test data generation · Object-oriented · Unit testing · Tuning · EvoSuite · Java · Response surface · Design of experiments



Acknowledgments

Andrea Arcuri is funded by the Norwegian Research Council. This project has been funded by a Google Focused Research Award on "Test Amplification".



Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  1. Certus Software V&V Center at Simula Research Laboratory, Lysaker, Norway
  2. Department of Computer Science, University of Sheffield, Sheffield, UK
