Minimax Search and Reinforcement Learning for Adversarial Tetris

  • Maria Rovatsou
  • Michail G. Lagoudakis
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6040)

Abstract

Game playing has always been considered an intellectual activity requiring a good level of intelligence. This paper focuses on Adversarial Tetris, a variation of the well-known Tetris game, introduced at the 3rd International Reinforcement Learning Competition in 2009. In Adversarial Tetris, the player's mission of completing as many lines as possible is actively hindered by an unknown adversary, who selects the falling tetrominoes in ways that make the game harder for the player. In addition, boards come in different sizes, and learning ability is tested over a variety of boards and adversaries. This paper describes the design and implementation of an agent capable of learning to improve its strategy against any adversary and any board size. The agent employs MiniMax search enhanced with Alpha-Beta pruning for looking ahead within the game tree, and a variation of the Least-Squares Temporal Difference (LSTD) learning algorithm for learning an appropriate state evaluation function over a small set of features. The learned strategies exhibit good performance over a wide range of boards and adversaries.
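The approach described above can be sketched in code. The following is a minimal, hypothetical illustration of MiniMax search with Alpha-Beta pruning over a two-player game tree, where the player maximizes and the adversary (who picks the next tetromino) minimizes, and where leaf states are scored by a linear evaluation function over features; in the paper's setting the weights of such a function would be learned by LSTD. `ToyState`, its methods, and the feature set are stand-ins, not the authors' actual implementation.

```python
import math

class ToyState:
    """Hypothetical stand-in for a game state: a node in a hand-built tree."""
    def __init__(self, feats, children=()):
        self._feats = feats          # feature vector of this state
        self._children = list(children)
    def features(self):
        return self._feats
    def successors(self):
        return self._children
    def is_terminal(self):
        return not self._children

def evaluate(state, weights):
    """Linear state evaluation: dot product of features and learned weights."""
    return sum(w * f for w, f in zip(weights, state.features()))

def alphabeta(state, depth, alpha, beta, maximizing, weights):
    """MiniMax with Alpha-Beta pruning; evaluates at the depth cutoff."""
    if depth == 0 or state.is_terminal():
        return evaluate(state, weights)
    if maximizing:  # the player chooses a placement
        value = -math.inf
        for child in state.successors():
            value = max(value, alphabeta(child, depth - 1,
                                         alpha, beta, False, weights))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # beta cutoff: adversary will never allow this branch
        return value
    else:  # the adversary chooses the next tetromino
        value = math.inf
        for child in state.successors():
            value = min(value, alphabeta(child, depth - 1,
                                         alpha, beta, True, weights))
            beta = min(beta, value)
            if alpha >= beta:
                break  # alpha cutoff: player already has a better option
        return value
```

For example, on a depth-2 toy tree whose leaves have single-feature values 3, 5 (under one adversary move) and 2, 9 (under the other), with weights `[1.0]`, the search returns 3: the adversary minimizes within each branch and the player takes the better branch, with the 9-leaf pruned.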

Keywords

Reinforcement Learning · Markov Decision Process · Board Size · Game Tree · Board Dimension
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Maria Rovatsou (1)
  • Michail G. Lagoudakis (1)
  1. Intelligent Systems Laboratory, Department of Electronic and Computer Engineering, Technical University of Crete, Chania, Greece