Fast Seed-Learning Algorithms for Games

  • Jialin Liu
  • Olivier Teytaud
  • Tristan Cazenave
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10068)

Abstract

Recently, a methodology has been proposed for boosting the computational intelligence of randomized game-playing programs. We propose faster variants of these algorithms, namely rectangular algorithms (fully parallel) and bandit algorithms (faster in a sequential setup). We evaluate their performance on several board games and card games. In addition, in the case of Go, we test the methodology when the opponent is completely distinct from the one used in training.
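The abstract does not spell out the algorithms, so the following is only a rough sketch of what a "rectangular", fully parallel seed-learning step might look like, not the authors' implementation. It assumes a hypothetical `play(black_seed, white_seed)` function that runs one game between two seeded copies of a randomized program; here it is replaced by a toy deterministic stand-in. The function name `best_seed_rectangular` and the rule of picking the seed with the most wins are likewise assumptions made for illustration.

```python
import random


def play(black_seed: int, white_seed: int) -> int:
    """Hypothetical stand-in for one game between two seeded players.

    Returns 1 if Black wins, 0 otherwise. The outcome is deterministic
    given both seeds, as it would be for a seeded randomized program.
    """
    # Assumed toy model: each Black seed has a hidden strength in [0.3, 0.7].
    strength = 0.3 + 0.4 * random.Random(black_seed).random()
    rng = random.Random(black_seed * 1_000_003 + white_seed)
    return 1 if rng.random() < strength else 0


def best_seed_rectangular(black_seeds, white_seeds):
    """Fill a len(black_seeds) x len(white_seeds) win matrix and return
    the Black seed with the highest win count, plus all win counts.

    Every cell is an independent game, so the whole matrix could be
    evaluated in parallel; this sketch simply loops over it.
    """
    wins = {b: sum(play(b, w) for w in white_seeds) for b in black_seeds}
    best = max(wins, key=wins.get)
    return best, wins


if __name__ == "__main__":
    best, wins = best_seed_rectangular(range(8), range(100, 132))
    print("selected seed:", best, "wins:", wins[best], "of 32")
```

A bandit variant would instead allocate games one at a time, spending more of them on seeds that currently look promising; that sequential allocation is not shown here.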

Keywords

Success Rate; Original Algorithm; Random Seed; Board Game; Simulated Game
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  1. TAO, Inria, University of Paris-Sud, UMR CNRS 8623, Gif-sur-Yvette, France
  2. LAMSADE, Université Paris-Dauphine, Paris, France
