Abstract
Monte Carlo Tree Search (MCTS) is a novel machine learning paradigm that is used to find good solutions for complex optimization problems with very large search spaces (like playing GO). We combine MCTS with Covariance Matrix Adaptation Evolution Strategies (CMA-ES) to efficiently optimize real-parameter single objective problems by balancing the exploitation of promising areas with the exploration of new regions of the search space. The novel algorithm is called hierarchical CMA-ES and it is influenced by both machine learning and evolutionary computation research areas. Like in evolutionary computation, we use a population of individuals to explore the commonalities of CMA-ES solvers. These CMA-ES solvers are structured using a MCTS tree like structure. Our experiments compare the performance of hierarchical CMA-ES solvers with two other algorithms: the standard CMA-ES optimizer, and an adaptation of MCTS to solve real-parameter problems. The hierarchical CMA-ES optimizer has the best empirical performance on several benchmark problems.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Auer, P.: Using confidence bounds for exploitation-exploration trade-offs. J. Mach. Learn. Res. 3, 397–422 (2002)
Auger, A., Hansen, N.: Tutorial CMA-ES: evolution strategies and covariance matrix adaptation. In: Genetic and Evolutionary Computation Conference, GECCO 2013, pp. 499–520 (2013)
Beyer, H.-G., Schwefel, H.-P.: Evolution strategies: a comprehensive introduction. J. Nat. Comput. 1(1), 3–52 (2002)
Browne, C., Powley, E., Whitehouse, D., Lucas, S., Cowling, P.I., Rohlfshanger, P., Tavener, S., Perez, D., Samothrakis, S., Colton, S.: A survey of monte carlo tree search methods. IEEE Trans. Comput. Intell. AI Games 4(1), 1–46 (2012)
Bubeck, S., Cesa-Bianchi, N.: Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems. In: Foundations and Trends in Machine Learning, vol. 5 (2012)
Couetoux, A.: Monte Carlo Tree Search for Continuous and Stochastic Sequential Decision Making Problems. PhD thesis, Université Paris Sud - Paris XI (2013)
Drugan, M.M., Isasi, P., Manderick, B.: Schemata bandits for binary encoded combinatorial optimisation problems. In: Simulated Evolution and Learning - 10th International Conference (SEAL), pp. 299–310 (2014)
Igel, C., Hansen, N., Roth, S.: Covariance matrix adaptation for multi-objective optimization. Evol. Comput. 15(1), 1–28 (2007)
Kocsis, L., Szepesvari, C.: Bandit based monte-carlo planning. In: Machine Learning: European Conference of Machine Learning (ECML) (2006)
Liang, J.J., Qu, B.Y., Suganthan, P.N., Chen, Q.: Problem definitions and evaluation criteria for the cec 2015 competition on learning-based real-parameter single objective optimization. Technical Report Technical Report 201411A, Computational Intelligence Laboratory, Zhengzhou University, Zhengzhou China (2015)
Munos, R.: From bandits to monte-carlo tree search: the optimistic principle applied to optimization and planning. Found. Trends Mach. Learn. 7(1), 1–129 (2014)
Acknowledgements
Madalina M. Drugan was supported by FWO project G.087814N “Multi-criteria RL”.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Drugan, M.M. (2018). Efficient Real-Parameter Single Objective Optimizer Using Hierarchical CMA-ES Solvers. In: Tantar, AA., Tantar, E., Emmerich, M., Legrand, P., Alboaie, L., Luchian, H. (eds) EVOLVE - A Bridge between Probability, Set Oriented Numerics, and Evolutionary Computation VI. Advances in Intelligent Systems and Computing, vol 674. Springer, Cham. https://doi.org/10.1007/978-3-319-69710-9_10
Download citation
DOI: https://doi.org/10.1007/978-3-319-69710-9_10
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-69708-6
Online ISBN: 978-3-319-69710-9
eBook Packages: EngineeringEngineering (R0)