Evolving Sampling Strategies for One-Shot Optimization Tasks

  • Jakob Bossek
  • Carola Doerr
  • Pascal Kerschke
  • Aneta Neumann
  • Frank Neumann
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12269)

Abstract

One-shot optimization tasks require determining the set of solution candidates prior to their evaluation, i.e., without the possibility of adaptive sampling. We consider two variants: classic one-shot optimization (where the aim is to find at least one solution of high quality) and one-shot regression (where the goal is to fit a model that resembles the true problem as well as possible). For both tasks it seems intuitive that well-distributed samples should perform better than uniform or grid-based samples, since they provide better coverage of the decision space. In practice, quasi-random designs such as Latin Hypercube Samples and low-discrepancy point sets are indeed very commonly used for one-shot optimization tasks.
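
To make the contrast concrete, the following minimal Python sketch (illustrative only, not the authors' code; their experiments rely on dedicated libraries such as the R package lhs) generates a plain uniform random design and a basic Latin Hypercube design in the unit cube:

```python
# Minimal sketch: i.i.d. uniform sampling vs. Latin Hypercube Sampling
# (LHS) in [0, 1]^d. Illustrative only; production code would normally
# use an established library implementation.
import numpy as np

def uniform_design(n, d, rng):
    """n i.i.d. uniform points in [0, 1]^d."""
    return rng.random((n, d))

def latin_hypercube_design(n, d, rng):
    """Basic LHS: each axis is cut into n equal strata, and every
    stratum contains exactly one coordinate of the design."""
    X = np.empty((n, d))
    for j in range(d):
        # one uniform draw per stratum, in randomly shuffled stratum order
        X[:, j] = (rng.permutation(n) + rng.random(n)) / n
    return X

rng = np.random.default_rng(42)
X_uni = uniform_design(100, 2, rng)          # may leave large gaps by chance
X_lhs = latin_hypercube_design(100, 2, rng)  # exactly one point per axis stratum
```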

In this work, we study how well low star discrepancy correlates with performance in one-shot optimization. Our results confirm an advantage of low-discrepancy designs, but also indicate that the correlation between discrepancy values and overall performance is rather weak. We then demonstrate that commonly used designs may be far from optimal. More precisely, we evolve 24 very specific designs, each of which achieves good performance on one of our benchmark problems. Interestingly, we find that these specifically evolved designs yield surprisingly good performance across the whole benchmark set. Our results therefore give a strong indication that significant performance gains over state-of-the-art one-shot sampling techniques are possible, and that evolutionary algorithms can be an efficient means of evolving them.
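
As an illustration of the general mechanism (not the algorithm or fitness function of the paper, which evolves designs for performance on its benchmark problems), the sketch below shows a simple (1+1) evolutionary algorithm that perturbs single points of a design to reduce the L2 star discrepancy, for which Warnock's formula provides a closed form:

```python
# Hedged sketch of evolving a point set with a (1+1) EA. The fitness here
# is the squared L2 star discrepancy (Warnock's closed-form formula), used
# as a cheap, computable stand-in; the paper optimizes benchmark
# performance directly rather than discrepancy.
import numpy as np

def l2_star_discrepancy_sq(X):
    """Warnock's formula for the squared L2 star discrepancy of X in [0, 1]^d."""
    n, d = X.shape
    term1 = 3.0 ** (-d)
    term2 = -(2.0 / n) * np.prod((1.0 - X ** 2) / 2.0, axis=1).sum()
    # pairwise products over dimensions of (1 - max(x_ik, x_jk))
    pair = np.prod(1.0 - np.maximum(X[:, None, :], X[None, :, :]), axis=2)
    term3 = pair.sum() / n ** 2
    return term1 + term2 + term3

def evolve_design(n=50, d=2, iters=5000, sigma=0.05, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.random((n, d))                # start from a uniform random design
    fit = l2_star_discrepancy_sq(X)
    for _ in range(iters):
        Y = X.copy()
        i = rng.integers(n)               # mutate one randomly chosen point
        Y[i] = np.clip(Y[i] + sigma * rng.standard_normal(d), 0.0, 1.0)
        f = l2_star_discrepancy_sq(Y)
        if f <= fit:                      # elitist (1+1) acceptance
            X, fit = Y, f
    return X, fit

X, fit = evolve_design()
print(f"evolved design, squared L2 star discrepancy: {fit:.6f}")
```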

Keywords

One-shot optimization · Regression · Fully parallel search · Surrogate-assisted optimization · Continuous optimization

Acknowledgments

We thank François-Michel de Rainville for help with his implementation of the generalized Halton sequences. We also thank the reviewers for providing useful comments and references. This work was financially supported by the Paris Ile-de-France Region, by ANR-11-LABX-0056-LMH, by the Australian Research Council (ARC) through grant DP190103894, and by the South Australian Government through the Research Consortium “Unlocking Complex Resources through Lean Processing”. Moreover, P. Kerschke acknowledges support by the European Research Center for Information Systems (ERCIS).

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Jakob Bossek¹
  • Carola Doerr²
  • Pascal Kerschke³
  • Aneta Neumann¹
  • Frank Neumann¹

  1. The University of Adelaide, Adelaide, Australia
  2. Sorbonne Université, CNRS, LIP6, Paris, France
  3. University of Münster, Münster, Germany