Computational Intelligence in Mind Games

  • Jacek Mańdziuk
Part of the Studies in Computational Intelligence book series (SCI, volume 63)

Summary

The chapter considers recent achievements and perspectives of Computational Intelligence (CI) applied to mind games. Several notable examples of unguided, autonomous CI learning systems are presented and discussed. Based on the advantages and limitations of existing approaches, a list of challenging issues and open problems in the area of intelligent game playing is proposed and motivated.

It is concluded that the ultimate goal of CI in mind-game research is the ability to mimic the human approach to game playing in all its major aspects, including learning methods (learning from scratch, multitask learning, unsupervised learning, pattern-based knowledge acquisition) as well as reasoning and decision making (efficient position estimation, abstraction and generalization of game features, autonomous development of evaluation functions, and effective move preordering combined with selective, contextual search).

Key words

challenges · CI in games · game playing · soft-computing methods · Chess · Checkers · Go · Othello · Give-Away Checkers · Backgammon · Bridge · Poker · Scrabble



Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Jacek Mańdziuk
  1. Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland
