Evolving Sampling Strategies for One-Shot Optimization Tasks
One-shot optimization tasks require the full set of solution candidates to be determined prior to their evaluation, i.e., without any possibility for adaptive sampling. We consider two variants: classic one-shot optimization, where the aim is to find at least one solution of high quality, and one-shot regression, where the goal is to fit a model that resembles the true problem as closely as possible. For both tasks it seems intuitive that well-distributed samples should perform better than uniform or grid-based ones, since they provide better coverage of the decision space. In practice, quasi-random designs such as Latin Hypercube Samples and low-discrepancy point sets are indeed among the most commonly used designs for one-shot optimization tasks.
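To make the contrast between these design types concrete, the following is a minimal sketch using SciPy's `scipy.stats.qmc` module (available since SciPy 1.7). The dimension, sample size, and seed are arbitrary illustration choices, and SciPy's L2-star discrepancy serves here as a computable stand-in for the star discrepancy discussed in the paper.

```python
# Minimal sketch: uniform random vs. Latin Hypercube vs. Sobol' designs
# in [0, 1]^d, compared by L2-star discrepancy. Assumes SciPy >= 1.7.
import numpy as np
from scipy.stats import qmc

d, n = 2, 64  # dimension and number of one-shot samples (illustrative)

rng = np.random.default_rng(seed=1)
uniform = rng.random((n, d))                     # i.i.d. uniform design
lhs = qmc.LatinHypercube(d=d, seed=1).random(n)  # Latin Hypercube design
sobol = qmc.Sobol(d=d, seed=1).random(n)         # low-discrepancy Sobol' set

# Lower discrepancy indicates a more evenly spread point set.
for name, pts in [("uniform", uniform), ("LHS", lhs), ("Sobol", sobol)]:
    disc = qmc.discrepancy(pts, method="L2-star")
    print(f"{name:8s} L2-star discrepancy: {disc:.5f}")
```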
In this work we study how well low star discrepancy correlates with performance in one-shot optimization. Our results confirm an advantage of low-discrepancy designs, but also indicate that the correlation between discrepancy values and overall performance is rather weak. We then demonstrate that commonly used designs may be far from optimal: we evolve 24 highly specialized designs, each achieving good performance on one of our benchmark problems. Interestingly, we find that these specifically evolved samples yield surprisingly good performance across the whole benchmark set. Our results therefore give a strong indication that significant performance gains over state-of-the-art one-shot sampling techniques are possible, and that evolutionary algorithms are an efficient means of finding them.
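To illustrate the idea of evolving a problem-specific design, here is a simple (1+1)-EA sketch. It is not the authors' algorithm: the benchmark function (a shifted sphere), the Gaussian mutation operator, and all parameter values are illustrative assumptions.

```python
# Illustrative (1+1)-EA sketch (not the paper's exact setup): evolve an
# n-point design in [0, 1]^d so that its best point on a fixed benchmark
# function is as good as possible. The shifted sphere below is a simple
# stand-in for a benchmark problem.
import numpy as np

rng = np.random.default_rng(seed=42)
d, n, generations, sigma = 2, 32, 2000, 0.05   # assumed parameters
shift = rng.random(d)  # hypothetical optimum: f(x) = ||x - shift||^2

def one_shot_quality(design):
    """One-shot criterion: objective value of the best point in the design."""
    return np.min(np.sum((design - shift) ** 2, axis=1))

parent = rng.random((n, d))          # start from a uniform random design
parent_fit = one_shot_quality(parent)

for _ in range(generations):
    # Mutate every coordinate with Gaussian noise, clipped to [0, 1].
    child = np.clip(parent + rng.normal(0.0, sigma, size=(n, d)), 0.0, 1.0)
    child_fit = one_shot_quality(child)
    if child_fit <= parent_fit:      # elitist acceptance
        parent, parent_fit = child, child_fit

print(f"best one-shot value after evolution: {parent_fit:.6f}")
```

Evolving a design against a single problem like this risks overfitting to that problem; the surprising finding reported above is that such specialized designs nevertheless generalized well across the benchmark set.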
Keywords: One-shot optimization · Regression · Fully parallel search · Surrogate-assisted optimization · Continuous optimization
We thank François-Michel de Rainville for help with his implementation of the generalized Halton sequences. We also thank the reviewers for providing useful comments and references. This work was financially supported by the Paris Ile-de-France Region, by ANR-11-LABX-0056-LMH, by the Australian Research Council (ARC) through grant DP190103894, and by the South Australian Government through the Research Consortium “Unlocking Complex Resources through Lean Processing”. Moreover, P. Kerschke acknowledges support from the European Research Center for Information Systems (ERCIS).