Learning of Go Board State Evaluation Function by Artificial Neural Network
We construct an artificial neural network, called T361G, to evaluate Go board states: the expected probability that Black (or White) wins, conditioned on the current board state on Black's (or White's) turn. Unlike existing Monte-Carlo Go programs, which evaluate each candidate move (i.e., each next board state) by performing random simulations on every turn, we use a large number of expert game records of Go as training data so that T361G learns an evaluation function of Go board states. We reduce the number of parameters to be learned by taking Go-specific properties into account. We show that T361G predicts the winning probability fairly well while avoiding overtraining, even from an insufficient amount of data.
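The scheme above (a supervised evaluation function mapping a board state to a winning probability, with the parameter count reduced via board properties) can be sketched as follows. This is a hypothetical illustration, not T361G's actual architecture: it uses a toy 5x5 board instead of 19x19, a single logistic unit instead of a full network, synthetic labeled boards instead of expert game records, and weight sharing across the board's 8-fold dihedral symmetry as one plausible reading of "Go-specific properties".

```python
import math
import random

N = 5  # toy board size; the paper's T361G targets the full 19x19 = 361 points

def symmetries(r, c):
    """Return all 8 dihedral images (rotations/reflections) of a point."""
    pts = set()
    for _ in range(4):
        r, c = c, N - 1 - r        # rotate 90 degrees
        pts.add((r, c))
        pts.add((r, N - 1 - c))    # mirror image of the rotated point
    return pts

# Tie weights together within each symmetry orbit of the board, shrinking
# the positional parameter count (25 -> 6 weights on a 5x5 board).
orbit = {}
num_orbits = 0
for r in range(N):
    for c in range(N):
        if (r, c) not in orbit:
            for p in symmetries(r, c):
                orbit[p] = num_orbits
            num_orbits += 1

def evaluate(board, w, b):
    """Sigmoid evaluation: estimated probability that Black wins."""
    z = b + sum(board[r][c] * w[orbit[(r, c)]]
                for r in range(N) for c in range(N))
    return 1.0 / (1.0 + math.exp(-z))

def train(data, epochs=100, lr=0.1):
    """Logistic-regression SGD on (board, black_wins) training pairs."""
    w, b = [0.0] * num_orbits, 0.0
    for _ in range(epochs):
        for board, y in data:
            g = evaluate(board, w, b) - y   # cross-entropy gradient w.r.t. z
            b -= lr * g
            for r in range(N):
                for c in range(N):
                    w[orbit[(r, c)]] -= lr * g * board[r][c]
    return w, b

# Synthetic stand-in for expert game records: +1 Black stone, -1 White stone,
# 0 empty, labeled by which side holds more stones.
random.seed(0)
data = []
while len(data) < 200:
    board = [[random.choice((-1, 0, 1)) for _ in range(N)] for _ in range(N)]
    s = sum(map(sum, board))
    if s != 0:
        data.append((board, 1 if s > 0 else 0))

w, b = train(data)
```

The weight sharing is the key cost saving: every point in the same symmetry orbit uses one shared weight, so the model learns 6 positional parameters rather than 25 (on a real 19x19 board, roughly 55 rather than 361), which helps when training data is scarce.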
Keywords: Go; neural network; supervised learning