Minimax Search and Reinforcement Learning for Adversarial Tetris
Game playing has long been regarded as an intellectual activity requiring a good level of intelligence. This paper focuses on Adversarial Tetris, a variation of the well-known Tetris game introduced at the 3rd International Reinforcement Learning Competition in 2009. In Adversarial Tetris, the player's goal of completing as many lines as possible is actively hindered by an unknown adversary, who selects the falling tetrominoes in ways that make the game harder for the player. In addition, boards come in different sizes, and learning ability is tested over a variety of boards and adversaries. This paper describes the design and implementation of an agent capable of learning to improve its strategy against any adversary and any board size. The agent employs MiniMax search enhanced with Alpha-Beta pruning to look ahead within the game tree, and a variation of the Least-Squares Temporal Difference (LSTD) learning algorithm to learn an appropriate state evaluation function over a small set of features. The learned strategies exhibit good performance over a wide range of boards and adversaries.
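The look-ahead component described above can be illustrated with a minimal sketch of depth-limited MiniMax with Alpha-Beta pruning. The `Node` class and the `evaluate` callback are hypothetical stand-ins for the paper's game-state interface and learned evaluation function; the paper's actual state representation and features are not reproduced here.

```python
class Node:
    """A generic game-tree node (illustrative stand-in for a Tetris state)."""
    def __init__(self, value=None, children=()):
        self.value, self.children = value, children

    def is_terminal(self):
        return not self.children

    def successors(self):
        return self.children


def alphabeta(state, depth, alpha, beta, maximizing, evaluate):
    """Depth-limited MiniMax with Alpha-Beta pruning.

    `evaluate` scores leaf/cutoff states, e.g. a learned linear
    state-evaluation function. The maximizing player picks piece
    placements; the minimizing adversary picks the next tetromino.
    """
    if depth == 0 or state.is_terminal():
        return evaluate(state)
    if maximizing:  # player's turn: best placement
        value = float("-inf")
        for child in state.successors():
            value = max(value, alphabeta(child, depth - 1, alpha, beta, False, evaluate))
            alpha = max(alpha, value)
            if alpha >= beta:  # prune: adversary will avoid this branch
                break
        return value
    else:  # adversary's turn: worst tetromino for the player
        value = float("inf")
        for child in state.successors():
            value = min(value, alphabeta(child, depth - 1, alpha, beta, True, evaluate))
            beta = min(beta, value)
            if alpha >= beta:
                break
        return value
```

For example, on a two-ply tree whose leaf values are (3, 5) and (2, 9), the player obtains max(min(3, 5), min(2, 9)) = 3.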
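The learning component can likewise be sketched. The following is a generic batch LSTD solver for a linear value function, not the paper's specific variation; the feature vectors, discount factor, and the small ridge term added for numerical stability are all illustrative assumptions.

```python
import numpy as np

def lstd(transitions, n_features, gamma=0.9, ridge=1e-6):
    """Batch Least-Squares Temporal Difference (LSTD) learning.

    Estimates weights w for a linear value function V(s) = w . phi(s)
    by solving A w = b, where over observed transitions (s, r, s'):
        A = sum phi(s) (phi(s) - gamma * phi(s'))^T
        b = sum r * phi(s)

    `transitions` is an iterable of (phi_s, reward, phi_next) tuples.
    The ridge term keeps A invertible on small samples (an assumption
    of this sketch, not necessarily of the paper's variant).
    """
    A = ridge * np.eye(n_features)
    b = np.zeros(n_features)
    for phi, r, phi_next in transitions:
        phi = np.asarray(phi, dtype=float)
        phi_next = np.asarray(phi_next, dtype=float)
        A += np.outer(phi, phi - gamma * phi_next)
        b += r * phi
    return np.linalg.solve(A, b)
```

As a sanity check, for a single constant feature phi(s) = 1, reward 1 per step, and gamma = 0.5, the fixed point is V = 1 / (1 - 0.5) = 2, which the solver recovers.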
Keywords: Reinforcement Learning, Markovian Decision Process, Board Size, Game Tree, Board Dimension