Abstract
Designing artificial players for the game of Tetris is a challenging problem that many authors have addressed using different methods. High-performing implementations based on evolution strategies have also been proposed. However, one drawback of using evolution strategies for this problem can be the cost of evaluations, which stems from the stochastic nature of the fitness function. This paper describes the use of racing algorithms to reduce the number of fitness evaluations and thereby shorten the learning time. Several experiments illustrate the benefits and limitations of racing in evolution strategies for this problem. Among the benefits is the ability to design artificial players at the level of the top-ranked players at a third of the cost.
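The idea of racing can be illustrated with a minimal sketch. It is not the paper's implementation: it is a simplified Hoeffding race that assumes fitness values lie in an interval of known width (value_range), applies the confidence level delta to each comparison without any correction for multiple tests, and uses hypothetical names throughout (hoeffding_race, noisy_score, true_means). Each candidate is re-evaluated only until its confidence interval shows that it belongs, or does not belong, among the mu best.

```python
import math
import random


def hoeffding_race(candidates, evaluate, mu, value_range,
                   delta=0.05, min_evals=3, max_evals=100):
    """Pick the mu best candidates under a noisy fitness (higher is better).

    Returns (sorted indices of the chosen candidates, evaluations spent).
    Simplified sketch: delta is applied per comparison, uncorrected.
    """
    n = len(candidates)
    sums = [0.0] * n
    counts = [0] * n
    undecided = set(range(n))
    selected, discarded = set(), set()
    total = 0

    def mean(i):
        return sums[i] / counts[i]

    def radius(i):
        # Hoeffding confidence radius for values spanning value_range.
        return value_range * math.sqrt(math.log(2.0 / delta) / (2.0 * counts[i]))

    for t in range(1, max_evals + 1):
        # Only candidates still in the race cost another evaluation.
        for i in undecided:
            sums[i] += evaluate(candidates[i])
            counts[i] += 1
            total += 1
        if t < min_evals:
            continue
        lower = [mean(i) - radius(i) for i in range(n)]
        upper = [mean(i) + radius(i) for i in range(n)]
        for i in list(undecided):
            worse = sum(1 for j in range(n) if j != i and lower[i] > upper[j])
            better = sum(1 for j in range(n) if j != i and upper[i] < lower[j])
            if worse >= n - mu:
                # Confidently above enough others to be among the mu best.
                selected.add(i)
                undecided.discard(i)
            elif better >= mu:
                # Confidently below at least mu others.
                discarded.add(i)
                undecided.discard(i)
        if len(selected) >= mu or not undecided:
            break

    def rank(i):
        # Prefer selected, then undecided, then discarded; break ties by mean.
        group = 0 if i in selected else (1 if i in undecided else 2)
        return (group, -mean(i))

    chosen = sorted(range(n), key=rank)[:mu]
    return sorted(chosen), total


# Toy usage: four hypothetical players whose true mean scores are known.
rng = random.Random(0)
true_means = [0.1, 0.3, 0.5, 0.9]


def noisy_score(m):
    # Bounded noise keeps every observed score inside [0, 1].
    return m + rng.uniform(-0.05, 0.05)


chosen, used = hoeffding_race(true_means, noisy_score, mu=1, value_range=1.0)
```

In this toy run the clearly weaker candidates leave the race early, so the total number of evaluations stays below the fixed budget of max_evals evaluations per candidate. That saving is the effect the abstract refers to.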
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Boumaza, A. (2012). Reducing the Learning Time of Tetris in Evolution Strategies. In: Hao, JK., Legrand, P., Collet, P., Monmarché, N., Lutton, E., Schoenauer, M. (eds) Artificial Evolution. EA 2011. Lecture Notes in Computer Science, vol 7401. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35533-2_17
DOI: https://doi.org/10.1007/978-3-642-35533-2_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35532-5
Online ISBN: 978-3-642-35533-2
eBook Packages: Computer Science, Computer Science (R0)