Gradient-Based Algorithms for Finding Nash Equilibria in Extensive Form Games

  • Andrew Gilpin
  • Samid Hoda
  • Javier Peña
  • Tuomas Sandholm
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4858)

Abstract

We present a computational approach to the saddle-point formulation for the Nash equilibria of two-person, zero-sum sequential games of imperfect information. The algorithm is a first-order gradient method based on modern smoothing techniques for non-smooth convex optimization. The algorithm requires O(1/ε) iterations to compute an ε-equilibrium, and the work per iteration is extremely low. These features enable us to find approximate Nash equilibria for sequential games with a tree representation of about 10^10 nodes. This is three orders of magnitude larger than what previous algorithms can handle. We present two heuristic improvements to the basic algorithm and demonstrate their efficacy on a range of real-world games. Furthermore, we demonstrate how the algorithm can be customized to a specific class of problems with enormous memory savings.
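To make the idea of smoothing a non-smooth saddle-point objective concrete, the following minimal sketch applies entropy smoothing to a small matrix game, min over x of max over y of x^T A y, with both strategies on probability simplices. It is an illustration under stated assumptions, not the algorithm of the paper: it uses plain entropic mirror descent rather than an accelerated scheme, so it does not attain the O(1/ε) iteration bound, and it works on a normal-form payoff matrix rather than a sequential game tree. The function names, the smoothing parameter `mu`, the iteration count, and the rock-paper-scissors payoff matrix are all hypothetical choices made for the example.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def smoothed_min_strategy(A, mu=0.05, iters=2000, x0=None):
    """Approximately minimize f(x) = max_j (A^T x)_j over the simplex.

    The non-smooth max is replaced by the entropy-smoothed function
    f_mu(x) = mu * log(mean_j exp((A^T x)_j / mu)), whose gradient is
    A @ softmax(A^T x / mu). Each step is an entropic mirror-descent
    (multiplicative-weights style) update on f_mu.
    Assumes A has at least one nonzero entry.
    """
    m = A.shape[0]
    x = np.full(m, 1.0 / m) if x0 is None else np.asarray(x0, dtype=float)
    # Step 1/L, with L = max|a_ij|^2 / mu the gradient Lipschitz constant
    # of f_mu with respect to the l1 norm.
    step = mu / (np.abs(A).max() ** 2)
    for _ in range(iters):
        y_smooth = softmax(A.T @ x / mu)  # smoothed best response of the opponent
        grad = A @ y_smooth               # gradient of f_mu at x
        x = x * np.exp(-step * grad)      # multiplicative update
        x /= x.sum()                      # renormalize onto the simplex
    return x

def saddle_gap(A, x, y):
    """Saddle-point gap: max_y' x^T A y' - min_x' x'^T A y (always >= 0)."""
    return (A.T @ x).max() - (A @ y).min()

# Hypothetical example: rock-paper-scissors; the row player minimizes x^T A y.
A = np.array([[0.0, -1.0, 1.0],
              [1.0, 0.0, -1.0],
              [-1.0, 1.0, 0.0]])

x = smoothed_min_strategy(A, x0=[0.6, 0.3, 0.1])
# The column player maximizes x^T A y, i.e. minimizes max_i (-A y)_i.
y = smoothed_min_strategy(-A.T, x0=[0.1, 0.3, 0.6])
gap = saddle_gap(A, x, y)
print("x =", x, "y =", y, "gap =", gap)
```

A pair (x, y) whose gap is at most ε is an ε-equilibrium in the sense used above. In this sketch the smoothing parameter trades accuracy for smoothness: a smaller `mu` tracks the true max more closely but forces a smaller step size.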

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Andrew Gilpin (1)
  • Samid Hoda (2)
  • Javier Peña (2)
  • Tuomas Sandholm (1)

  1. Computer Science Department, Carnegie Mellon University
  2. Tepper School of Business, Carnegie Mellon University
