DeepChess: End-to-End Deep Neural Network for Automatic Learning in Chess
Abstract
We present an end-to-end learning method for chess, relying on deep neural networks. Without any a priori knowledge, and in particular without any knowledge of the rules of chess, a deep neural network is trained using a combination of unsupervised pretraining and supervised training. The unsupervised pretraining extracts high-level features from a given position, and the supervised training learns to compare two chess positions and select the more favorable one. The training relies entirely on datasets of several million chess games, and no further domain-specific knowledge is incorporated.
The experiments show that the resulting neural network (referred to as DeepChess) is on a par with state-of-the-art chess programs, which have been developed through many years of manual feature selection and tuning. DeepChess is the first end-to-end machine-learning-based method to achieve grandmaster-level chess-playing performance.
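To make the described pipeline concrete, the following is a minimal PyTorch sketch of such a two-stage architecture: an unsupervised feature extractor trained on single positions, reused as a shared Siamese trunk in a supervised network that compares two positions. It is an illustration only; the 773-bit position encoding, the layer widths, and the names Pos2Vec and DeepChessComparator are assumptions made for this sketch, not details taken from the abstract.

```python
import torch
import torch.nn as nn

class Pos2Vec(nn.Module):
    """Unsupervised feature extractor (hypothetical sketch): trained layer-wise
    as a stack of autoencoders on single positions, then reused as a shared
    trunk. The 773-dim bitboard input and layer widths are assumptions."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(773, 600), nn.ReLU(),
            nn.Linear(600, 400), nn.ReLU(),
            nn.Linear(400, 200), nn.ReLU(),
            nn.Linear(200, 100), nn.ReLU(),
        )

    def forward(self, x):
        return self.encoder(x)

class DeepChessComparator(nn.Module):
    """Supervised Siamese head (hypothetical sketch): embeds two positions with
    the shared extractor and predicts which of the two is preferable."""
    def __init__(self):
        super().__init__()
        self.pos2vec = Pos2Vec()          # shared weights for both inputs
        self.head = nn.Sequential(
            nn.Linear(200, 100), nn.ReLU(),
            nn.Linear(100, 2),            # logits: (left better, right better)
        )

    def forward(self, left, right):
        features = torch.cat([self.pos2vec(left), self.pos2vec(right)], dim=1)
        return self.head(features)

# Usage: compare a batch of position pairs, e.g. drawn from won vs. lost games.
model = DeepChessComparator()
left = torch.rand(32, 773)    # placeholder encodings of 32 positions
right = torch.rand(32, 773)
logits = model(left, right)   # shape (32, 2); argmax selects the better position
```

Framing the evaluation as a pairwise comparison, rather than as a scalar score regression, matches the abstract's description of learning to "compare two chess positions and select the more favorable one."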
Keywords
Deep Neural Network · High-Level Feature · Supervised Training · Deep Belief Network · Chess Game