The Global Landscape of Objective Functions for the Optimization of Shogi Piece Values with a Game-Tree Search

  • Kunihito Hoki
  • Tomoyuki Kaneko
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7168)

Abstract

The landscape of an objective function for supervised learning of evaluation functions is numerically investigated for a limited number of feature variables. Despite the importance of such learning methods, the properties of the objective function are still not well known because of its complicated dependence on millions of tree-search values. This paper shows that the objective function has multiple local minima and that the global minimum point indicates reasonable feature values. Moreover, the function is continuous with a practically computable numerical accuracy. However, the function is not partially differentiable at points on the critical boundaries. An existing iterative method is shown to minimize the function from random initial values with great stability, but it may end up at an unreasonable local minimum point if the initial random values are far from the desired values. Furthermore, the obtained minimum points are shown to form a funnel structure.
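
The combination of continuity with points that are not differentiable can be illustrated with a toy model. The sketch below is not the paper's objective function: it assumes a one-ply search whose value is the maximum over two hypothetical leaf evaluations, each linear in a single piece weight `w`, and it assumes that a "critical boundary" is a weight at which the best leaf switches. All names and numbers are invented for illustration.

```python
# Toy illustration only: a one-ply "search" whose value is the max over
# two leaf evaluations, each linear in a single piece weight w.
# Every name and number here is hypothetical, not taken from the paper.

def search_value(w, leaves):
    """Value at a max node: the best leaf evaluation a*w + b."""
    return max(a * w + b for a, b in leaves)

def one_sided_slopes(f, w, h=1e-6):
    """Left and right finite-difference slopes of f at w."""
    left = (f(w) - f(w - h)) / h
    right = (f(w + h) - f(w)) / h
    return left, right

# Leaf A scores the material weighted by w; leaf B keeps a fixed score.
LEAVES = [(1.0, 0.0), (0.0, 2.0)]  # (slope, offset) pairs

def value(w):
    return search_value(w, LEAVES)

# The best leaf switches at w = 2, since value(w) = max(w, 2).
# The function is continuous there, but its two one-sided slopes differ.
left, right = one_sided_slopes(value, 2.0)
print(f"value(2.0) = {value(2.0)}, left slope = {left:.3f}, right slope = {right:.3f}")
```

In this toy model the left slope is about 0 and the right slope is about 1 at `w = 2`, so the derivative does not exist at that point even though the value changes continuously across it.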

Keywords

Objective Function, Loss Function, Tree Search, Minimum Point, Supervised Learning
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Kunihito Hoki (1)
  • Tomoyuki Kaneko (2)
  1. Department of Communication Engineering and Informatics, The University of Electro-Communications, Tokyo, Japan
  2. Department of Graphics and Computer Sciences, The University of Tokyo, Tokyo, Japan
