CLOP: Confident Local Optimization for Noisy Black-Box Parameter Tuning

  • Conference paper

Part of the Lecture Notes in Computer Science book series (LNTCS, volume 7168)

Abstract

Artificial intelligence in games often leads to the problem of parameter tuning. Heuristics may have coefficients, and they should be tuned to maximize the win rate of the program. A possible approach is to build local quadratic models of the win rate as a function of program parameters. Many local regression algorithms have already been proposed for this task, but they are usually not robust enough to deal automatically and efficiently with very noisy outputs and non-negative Hessians. The CLOP principle, which stands for Confident Local OPtimization, is a new approach to local regression that overcomes these problems in a straightforward and efficient way: samples whose estimated value is confidently inferior to the mean of all samples are discarded. Experiments demonstrate that, when the function to be optimized is smooth, this method outperforms all other tested algorithms.
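The sketch below illustrates the discarding idea as described in the abstract only; it is not the paper's algorithm or implementation. It fits a quadratic logistic model of the win rate, then drops samples whose predicted value falls below the mean prediction by a fixed margin and refits. The function name `clop_style_fit`, the fixed `margin`, and the toy data are all hypothetical stand-ins; the paper uses a proper statistical confidence criterion rather than a fixed margin.

```python
# Minimal, illustrative sketch of the abstract's idea (not the author's method):
# fit a local quadratic logistic model of the win rate, discard samples whose
# estimated value is "confidently" below the mean estimate, refit on the rest.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import PolynomialFeatures

def clop_style_fit(params, wins, margin=0.05, n_rounds=5):
    """params: (n, d) parameter settings; wins: (n,) 0/1 game outcomes."""
    quad = PolynomialFeatures(degree=2, include_bias=False)
    X = quad.fit_transform(params)            # quadratic features -> quadratic logit
    keep = np.ones(len(wins), dtype=bool)
    model = LogisticRegression()
    for _ in range(n_rounds):
        model.fit(X[keep], wins[keep])        # local quadratic logistic regression
        pred = model.predict_proba(X)[:, 1]   # estimated win rate at every sample
        # Keep samples not confidently below the mean estimate; the fixed
        # margin is a crude stand-in for the paper's confidence test.
        new_keep = pred >= pred.mean() - margin
        if new_keep.sum() < X.shape[1] + 1 or np.array_equal(new_keep, keep):
            break
        keep = new_keep
    return model, keep

# Toy usage: one parameter, noisy win rate peaking near 0.3.
rng = np.random.default_rng(0)
theta = rng.uniform(-1.0, 1.0, size=(400, 1))
p_win = 0.5 + 0.3 * np.exp(-8.0 * (theta[:, 0] - 0.3) ** 2)
outcomes = (rng.random(400) < p_win).astype(int)
model, kept = clop_style_fit(theta, outcomes)
```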

Keywords

  • Quadratic Regression
  • Winter Simulation
  • Noisy Observation
  • Noisy Function
  • Noisy Optimization

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Coulom, R. (2012). CLOP: Confident Local Optimization for Noisy Black-Box Parameter Tuning. In: van den Herik, H.J., Plaat, A. (eds) Advances in Computer Games. ACG 2011. Lecture Notes in Computer Science, vol 7168. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31866-5_13

  • DOI: https://doi.org/10.1007/978-3-642-31866-5_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-31865-8

  • Online ISBN: 978-3-642-31866-5

  • eBook Packages: Computer Science, Computer Science (R0)