Analysis of Evaluation-Function Learning by Comparison of Sibling Nodes

  • Conference paper
Advances in Computer Games (ACG 2011)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7168)

Abstract

This paper discusses the gradients of search values with respect to a parameter vector θ of an evaluation function. Recent learning methods for evaluation functions in computer shogi minimize an objective function defined on search results. In these methods, the gradient of the evaluation function at the leaf position of a principal variation (PV) is used as a convenient substitute for the gradient of the search result. By analyzing the variations of the min-max value, we show (1) when the min-max value is partially differentiable and (2) how the substitution may introduce errors. Experiments on a shogi program with about a million parameters show how frequently such errors occur, as well as how effective the substitution is for parameter tuning in practice.
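
The substitution described in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the evaluation function is linear in θ (so its gradient at a leaf is simply the leaf's feature vector), the game tree is a hand-made two-ply tree, and the names `evaluate`, `minimax`, `pv_leaf_gradient`, and `one_sided_gradients` are hypothetical. The sketch compares the PV-leaf gradient with one-sided finite differences of the min-max value, once where the PV is unique and once where two siblings tie.

```python
# Toy illustration only: linear evaluation, hand-made tree, hypothetical names.
# A leaf is a tuple of features; an inner node is a list of child subtrees.

def evaluate(theta, features):
    """Linear evaluation: theta . features (assumed form, not the paper's)."""
    return sum(t * f for t, f in zip(theta, features))

def minimax(node, theta, maximizing=True):
    """Return (min-max value, feature vector of the PV leaf)."""
    if isinstance(node, tuple):
        return evaluate(theta, node), node
    results = [minimax(child, theta, not maximizing) for child in node]
    pick = max if maximizing else min
    # On ties, max/min keep the first extremal child, as a simple search would.
    return pick(results, key=lambda r: r[0])

def pv_leaf_gradient(tree, theta):
    """Substitute gradient: gradient of the evaluation at the PV leaf.
    For a linear evaluation this is the leaf's feature vector."""
    return list(minimax(tree, theta)[1])

def one_sided_gradients(tree, theta, eps=1e-6):
    """Forward and backward finite differences of the min-max value."""
    base = minimax(tree, theta)[0]
    forward, backward = [], []
    for i in range(len(theta)):
        up = list(theta)
        up[i] += eps
        down = list(theta)
        down[i] -= eps
        forward.append((minimax(tree, up)[0] - base) / eps)
        backward.append((base - minimax(tree, down)[0]) / eps)
    return forward, backward

# Root is a max node with two min-node children, each with two leaves.
TREE = [[(1.0, 0.0), (3.0, 1.0)], [(0.0, 1.0), (2.0, 2.0)]]

if __name__ == "__main__":
    for theta in ([2.0, 1.0], [1.0, 1.0]):
        print("theta:", theta)
        print("  PV-leaf gradient:", pv_leaf_gradient(TREE, theta))
        print("  forward/backward:", one_sided_gradients(TREE, theta))
```

In this toy tree, θ = (2, 1) gives a unique PV, and the forward and backward differences agree with the PV-leaf gradient. At θ = (1, 1) the two children of the root tie, the forward and backward differences disagree, and the PV-leaf gradient matches neither, which is one simple way such a substitution can introduce an error.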

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kaneko, T., Hoki, K. (2012). Analysis of Evaluation-Function Learning by Comparison of Sibling Nodes. In: van den Herik, H.J., Plaat, A. (eds) Advances in Computer Games. ACG 2011. Lecture Notes in Computer Science, vol 7168. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31866-5_14

  • DOI: https://doi.org/10.1007/978-3-642-31866-5_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-31865-8

  • Online ISBN: 978-3-642-31866-5

  • eBook Packages: Computer Science, Computer Science (R0)
