Mimicking Go Experts with Convolutional Neural Networks

  • Ilya Sutskever
  • Vinod Nair
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5164)

Abstract

Building a strong computer Go player is a longstanding open problem. In this paper we consider the related problem of predicting the moves made by Go experts in professional games. The ability to predict experts’ moves is useful because it can, in principle, be used to narrow the search done by a computer Go player. We applied an ensemble of convolutional neural networks to this problem. Our main result is that the ensemble learns to predict 36.9% of the moves made in test expert Go games, improving upon the state of the art, and that the best single convolutional neural network of the ensemble achieves 34% accuracy. This network has fewer than 10^4 parameters.

Keywords

Go, Move prediction, Convolutional Neural Networks

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Ilya Sutskever (1)
  • Vinod Nair (1)
  1. Department of Computer Science, University of Toronto, Toronto, Canada
