Improving Temporal Difference Learning Performance in Backgammon Variants

  • Nikolaos Papahristou
  • Ioannis Refanidis
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7168)

Abstract

Palamedes is an ongoing project for building expert-level bots that play backgammon variants. Like all successful modern backgammon programs, it is based on neural networks trained using temporal difference learning. This paper improves on the training method used in our previous approach for Plakoto and Fevga, two backgammon variants popular in Greece and neighboring countries. We show that the proposed methods result in both faster learning and better performance. We also present insights into the feature selection in our experiments that can be useful for temporal difference learning in other games.

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Nikolaos Papahristou (1)
  • Ioannis Refanidis (1)
  1. Department of Applied Informatics, University of Macedonia, Thessaloniki, Greece
