First Results from Using Temporal Difference Learning in Shogi

  • Donald F. Beal
  • Martin C. Smith
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1558)

Abstract

This paper describes first results from the application of Temporal Difference learning [1] to shogi. We report on experiments to determine whether sensible values for shogi pieces can be obtained in the same manner as for western chess pieces [2]. Learning is driven entirely by randomised self-play, without access to any form of expert knowledge. The piece values are used in a simple search program that chooses shogi moves from a shallow lookahead, using the piece values to evaluate the leaves, with a random tie-break at the top level. Temporal difference learning is used to adjust the piece values over the course of a series of games. The method succeeds in learning values that perform well in matches against hand-crafted values.
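
As an illustration of how temporal difference learning can adjust piece values after a game, the following is a minimal sketch of a TD(lambda) update in the style of [1]. It is not the authors' implementation. The function names, the sigmoid squashing of the material balance, the feature encoding (piece-count differences per piece type) and the values of alpha and lam are all assumptions made for this example.

```python
import math


def predict(weights, features):
    """Squash a weighted material balance into a value in (0, 1).

    features[i] is assumed to be (own count - opponent count) for piece type i.
    The sigmoid squashing is an assumption of this sketch.
    """
    score = sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-score))


def td_lambda_update(weights, positions, outcome, alpha=0.05, lam=0.9):
    """Return updated piece values after one game.

    positions: feature vectors for the successive positions of the game.
    outcome:   1.0 for a win, 0.0 for a loss, 0.5 for a draw.
    alpha:     learning rate (illustrative value).
    lam:       trace decay parameter lambda (illustrative value).
    """
    preds = [predict(weights, x) for x in positions]
    # Each prediction is trained towards the next one; the last towards the outcome.
    targets = preds[1:] + [outcome]
    trace = [0.0] * len(weights)    # decayed sum of past prediction gradients
    delta_w = [0.0] * len(weights)  # weight change accumulated over the game
    for x, p, nxt in zip(positions, preds, targets):
        grad = [p * (1.0 - p) * f for f in x]  # gradient of the sigmoid output
        trace = [lam * e + g for e, g in zip(trace, grad)]
        td_error = nxt - p
        delta_w = [d + alpha * td_error * e for d, e in zip(delta_w, trace)]
    return [w + d for w, d in zip(weights, delta_w)]


# Hypothetical two-piece-type game that ends in a win.
new_values = td_lambda_update([0.0, 0.0], [[1, 0], [1, 1]], outcome=1.0)
```

In this sketch the weights are updated once per game from predictions made with the pre-game weights; updating after every move is an equally plausible design and would change only where the weight change is applied.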

Keywords

Learning Shogi Temporal Difference Minimax Search Gameplaying 

References

  1. Sutton, R.S.: Learning to Predict by the Methods of Temporal Differences. Machine Learning 3 (1988) 9–44
  2. Beal, D.F. and Smith, M.C.: Learning Piece Values Using Temporal Differences. International Computer Chess Association Journal, Vol. 20, No. 3 (1997) 147–151
  3. Levinson, R. and Snyder, R.: Adaptive Pattern Oriented Chess. Proceedings of AAAI-91, Morgan Kaufmann (1991) 601–605
  4. Christensen, J. and Korf, R.: A Unified Theory of Heuristic Evaluation Functions and its Application to Learning. AAAI-86, Morgan Kaufmann (1986) 148–152
  5. Baxter, J., Tridgell, A. and Weaver, L.: KnightCap: A chess program that learns by combining TD(lambda) with game-tree search. In: Machine Learning, Proceedings of the Fifteenth International Conference (ICML ’98), Madison (1998) 28–36
  6. Fairbairn, J.: Shogi for Beginners. Ishi Press International (1989)
  7. Leggett, T.: Shogi: Japan’s Game of Strategy. Charles E. Tuttle Company [Reprinted in 1993, first published in 1966]
  8. Matsubara, H., Iida, H. and Grimbergen, R.: Natural Developments in Game Research: From Chess to Shogi to Go. International Computer Chess Association Journal, Vol. 19, No. 2 (1996) 103–112
  9. Tesauro, G.: Practical Issues in Temporal Difference Learning. Machine Learning 8 (1992) 257–277
  10. Tesauro, G.: TD-Gammon, a Self-Teaching Backgammon Program, achieves Master Level Play. Neural Computation, Vol. 6, No. 2 (1994) 215–219
  11. Marsland, T.A.: Computer Chess and Search. In: Shapiro, S. (ed.) Encyclopaedia of Artificial Intelligence. 2nd edn. J. Wiley & Sons (1992)
  12. Beal, D.F.: Experiments with the Null Move. In: Beal, D.F. (ed.) Advances in Computer Chess 5. Elsevier Science Publishers (1989) 65–79
  13. Donninger, C.: Null Move and Deep Search: Selective Search Heuristics for Obtuse Chess Programs. International Computer Chess Association Journal, Vol. 16, No. 3 (1993) 137–143
  14. Mutz, M.: Gnu Shogi v1.2p03. Available from many sources, including ftp://ftp.unipassau.de/pub/local/shogi (1994)
  15. Yamashita, H.: YSS: About the Data Structures and the Algorithm. Published on the WWW at http://plaza15.mbn.or.jp/~yss (1997)

Copyright information

© Springer-Verlag Berlin Heidelberg 1999

Authors and Affiliations

  • Donald F. Beal (1)
  • Martin C. Smith (1)
  1. Department of Computer Science, Queen Mary and Westfield College, University of London, London, England