First Results from Using Temporal Difference Learning in Shogi

  • Donald F. Beal
  • Martin C. Smith
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1558)


This paper describes first results from the application of Temporal Difference learning [1] to shogi. We report on experiments to determine whether sensible values for shogi pieces can be obtained in the same manner as for western chess pieces [2]. The learning is obtained entirely from randomised self-play, without access to any form of expert knowledge. The piece values are used in a simple search program that chooses shogi moves from a shallow lookahead, using piece values to evaluate the leaves, with a random tie-break at the top level. Temporal difference learning is used to adjust the piece values over the course of a series of games. The method is successful in learning values that perform well in matches against hand-crafted values.
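The two mechanisms named in the abstract, a random tie-break among equally scored moves and a temporal difference adjustment of piece values, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the prediction function, learning rate, or trace decay, so the sigmoid squashing, `alpha`, `lam`, and all function names here are assumptions chosen for illustration, following the general TD(lambda) rule of [1].

```python
import math
import random


def predict(weights, features):
    """Map a material balance to a win probability in (0, 1).

    Assumption: a logistic squash of the weighted piece-count differences.
    """
    score = sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-score))


def td_lambda_update(weights, positions, outcome, alpha=0.05, lam=0.95):
    """Return adjusted piece values after one game (hypothetical helper).

    positions: feature vectors for successive positions, each holding
               piece-count differences (own minus opponent) per piece type.
    outcome:   1.0 for a win, 0.0 for a loss, 0.5 for a draw.
    """
    preds = [predict(weights, x) for x in positions]
    # Each prediction is trained towards its successor; the last one
    # is trained towards the actual game result.
    targets = preds[1:] + [outcome]
    new_weights = list(weights)
    trace = [0.0] * len(weights)
    for x, p, target in zip(positions, preds, targets):
        # Gradient of the sigmoid prediction with respect to each weight.
        grad = [p * (1.0 - p) * f for f in x]
        # Eligibility trace: decayed sum of past gradients.
        trace = [lam * e + g for e, g in zip(trace, grad)]
        delta = target - p
        new_weights = [w + alpha * delta * e for w, e in zip(new_weights, trace)]
    return new_weights


def choose_move(scored_moves, rng=random):
    """Pick uniformly at random among the best-scoring (move, score) pairs."""
    best = max(score for _, score in scored_moves)
    return rng.choice([move for move, score in scored_moves if score == best])
```

In this sketch a won game in which the learner was ahead by one piece of the first type raises that piece's value and leaves the other unchanged, for example `td_lambda_update([1.0, 1.0], [[0, 0], [1, 0], [1, 0]], 1.0)`. The random tie-break matters in self-play because it keeps games from repeating when many moves receive the same material score.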


Keywords: Learning, Shogi, Temporal Difference, Minimax Search, Gameplaying




  1. Sutton, R.S.: Learning to Predict by the Methods of Temporal Differences. Machine Learning 3 (1988) 9–44
  2. Beal, D.F. and Smith, M.C.: Learning Piece Values Using Temporal Differences. International Computer Chess Association Journal, Vol. 20, No. 3 (1997) 147–151
  3. Levinson, R. and Snyder, R.: Adaptive Pattern Oriented Chess. Proceedings of AAAI-91, Morgan Kaufmann (1991) 601–605
  4. Christensen, J. and Korf, R.: A Unified Theory of Heuristic Evaluation Functions and its Application to Learning. AAAI-86, Morgan Kaufmann (1986) 148–152
  5. Baxter, J., Tridgell, A. and Weaver, L.: KnightCap: A chess program that learns by combining TD(lambda) with game-tree search. In: Machine Learning, Proceedings of the Fifteenth International Conference (ICML ’98), Madison (1998) 28–36
  6. Fairbairn, J.: Shogi for Beginners. Ishi Press International (1989)
  7. Leggett, T.: Shogi: Japan’s Game of Strategy. Charles E. Tuttle Company [Reprinted in 1993, first published in 1966]
  8. Matsubara, H., Iida, H. and Grimbergen, R.: Natural Developments in Game Research: From Chess to Shogi to Go. International Computer Chess Association Journal, Vol. 19, No. 2 (1996) 103–112
  9. Tesauro, G.: Practical Issues in Temporal Difference Learning. Machine Learning 8 (1992) 257–277
  10. Tesauro, G.: TD-Gammon, a Self-Teaching Backgammon Program, achieves Master Level Play. Neural Computation, Vol. 6, No. 2 (1994) 215–219
  11. Marsland, T.A.: Computer Chess and Search. In: Shapiro, S. (ed.) Encyclopaedia of Artificial Intelligence. 2nd edn. J. Wiley & Sons (1992)
  12. Beal, D.F.: Experiments with the Null Move. In: Beal, D.F. (ed.) Advances in Computer Chess 5. Elsevier Science Publishers (1989) 65–79
  13. Donninger, C.: Null Move and Deep Search: Selective Search Heuristics for Obtuse Chess Programs. International Computer Chess Association Journal, Vol. 16, No. 3 (1993) 137–143
  14. Mutz, M.: Gnu Shogi v1.2p03. Available from many sources, including ftp://ftp.unipassau.de/pub/local/shogi (1994)
  15. Yamashita, H.: YSS: About the Data Structures and the Algorithm. Published on the WWW at (1997)

Copyright information

© Springer-Verlag Berlin Heidelberg 1999

Authors and Affiliations

  • Donald F. Beal (1)
  • Martin C. Smith (1)
  1. Department of Computer Science, Queen Mary and Westfield College, University of London, London, England
