A Pumping Algorithm for Ergodic Stochastic Mean Payoff Games with Perfect Information

  • Endre Boros
  • Khaled Elbassioni
  • Vladimir Gurvich
  • Kazuhisa Makino
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6080)

Abstract

In this paper, we consider two-person zero-sum stochastic mean payoff games with perfect information, or BWR-games, given by a digraph \(G = (V = V_B \cup V_W \cup V_R, E)\), with local rewards \(r: E \to {\mathbb R}\), and three types of vertices: black \(V_B\), white \(V_W\), and random \(V_R\). The game is played by two players, White and Black: when the play is at a white (black) vertex \(v\), White (Black) selects an outgoing arc \((v,u)\). When the play is at a random vertex \(v\), a vertex \(u\) is picked with the given probability \(p(v,u)\). In all cases, Black pays White the value \(r(v,u)\). The play continues forever, and White aims to maximize (Black aims to minimize) the limiting mean (that is, average) payoff. It was recently shown in [7] that BWR-games are polynomially equivalent to the classical Gillette games, which include many well-known subclasses, such as cyclic games, simple stochastic games (SSGs), stochastic parity games, and Markov decision processes. In this paper, we give a new algorithm for solving BWR-games in the ergodic case, that is, when the optimal values do not depend on the initial position. Our algorithm solves a BWR-game by reducing it, using a potential transformation, to a canonical form in which the optimal strategies of both players and the value for every initial position are obvious, since a locally optimal move in it is optimal in the whole game. We show that this algorithm is pseudo-polynomial when the number of random nodes is constant. We also provide an almost matching lower bound on its running time, and show that this bound holds for a wider class of algorithms. Let us add that the general (non-ergodic) case is at least as hard as SSGs, for which no pseudo-polynomial algorithm is known.
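
The abstract does not spell out the potential transformation or the canonical form, so the following is only a minimal sketch under stated assumptions. It assumes the usual form of a potential transformation, \(r_x(v,u) = r(v,u) + x(v) - x(u)\), and reads "canonical form" in the ergodic case as: the locally optimal value (maximum at white vertices, minimum at black vertices, expectation at random vertices) is the same at every vertex. The toy game, the potentials `x`, and all function names are hypothetical illustrations, and the sketch only checks a given potential; it does not implement the pumping algorithm itself.

```python
from fractions import Fraction

# Hypothetical toy BWR-game (not taken from the paper).
# kind[v] is "W" (White), "B" (Black) or "R" (random).
kind = {"w": "W", "b": "B", "c": "R"}
# arcs[v][u] = (local reward r(v,u), probability p(v,u); None at player vertices)
arcs = {
    "w": {"b": (3, None), "c": (1, None)},
    "b": {"w": (1, None), "c": (5, None)},
    "c": {"w": (2, Fraction(1, 2)), "b": (0, Fraction(1, 2))},
}

def transformed(arcs, x):
    """Assumed potential transformation: r_x(v,u) = r(v,u) + x(v) - x(u)."""
    return {v: {u: r + x[v] - x[u] for u, (r, _) in out.items()}
            for v, out in arcs.items()}

def local_values(kind, arcs, x):
    """Locally optimal transformed reward at each vertex."""
    rx = transformed(arcs, x)
    m = {}
    for v, out in arcs.items():
        if kind[v] == "W":      # White maximizes
            m[v] = max(rx[v].values())
        elif kind[v] == "B":    # Black minimizes
            m[v] = min(rx[v].values())
        else:                   # random vertex: expected transformed reward
            m[v] = sum(p * rx[v][u] for u, (_, p) in out.items())
    return m

def is_canonical(kind, arcs, x):
    """Assumed ergodic canonical form: all local values coincide."""
    return len(set(local_values(kind, arcs, x).values())) == 1

def locally_optimal_moves(kind, arcs, x):
    """Pick, at each player vertex, an arc with the best transformed reward."""
    rx = transformed(arcs, x)
    pick = {"W": max, "B": min}
    return {v: pick[kind[v]](rx[v], key=rx[v].get)
            for v in arcs if kind[v] != "R"}

zero = {v: 0 for v in arcs}                   # untransformed game
x = {"w": 0, "b": 1, "c": Fraction(3, 2)}     # hand-picked potentials

print(local_values(kind, arcs, zero))   # local values differ across vertices
print(local_values(kind, arcs, x))      # local values coincide
print(locally_optimal_moves(kind, arcs, x))
```

Because the potentials telescope along any cycle, the transformation leaves cycle means unchanged: in this toy game the cycle w, b, w has rewards 3 and 1 before the transformation and 2 and 2 after it, with mean 2 in both cases.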

Keywords

mean payoff games, local reward, Gillette model, perfect information, potential, stochastic games

References

  1. Andersson, D., Miltersen, P.B.: The complexity of solving stochastic games on graphs. In: Dong, Y., Du, D.-Z., Ibarra, O. (eds.) ISAAC 2009. LNCS, vol. 5878, pp. 112–121. Springer, Heidelberg (2009)
  2. Beffara, E., Vorobyov, S.: Adapting Gurvich-Karzanov-Khachiyan’s algorithm for parity games: Implementation and experimentation. Technical Report 2001-020, Department of Information Technology, Uppsala University (2001), https://www.it.uu.se/research/reports/#2001
  3. Beffara, E., Vorobyov, S.: Is randomized Gurvich-Karzanov-Khachiyan’s algorithm for parity games polynomial? Technical Report 2001-025, Department of Information Technology, Uppsala University (2001), https://www.it.uu.se/research/reports/#2001
  4. Björklund, H., Sandberg, S., Vorobyov, S.: A combinatorial strongly sub-exponential strategy improvement algorithm for mean payoff games. DIMACS Technical Report 2004-05, DIMACS, Rutgers University (2004)
  5. Björklund, H., Vorobyov, S.: Combinatorial structure and randomized subexponential algorithms for infinite games. Theoretical Computer Science 349(3), 347–360 (2005)
  6. Björklund, H., Vorobyov, S.: A combinatorial strongly sub-exponential strategy improvement algorithm for mean payoff games. Discrete Applied Mathematics 155(2), 210–229 (2007)
  7. Boros, E., Elbassioni, K., Gurvich, V., Makino, K.: Every stochastic game with perfect information admits a canonical form. RRR-09-2009, RUTCOR. Rutgers University (2009)
  8. Boros, E., Elbassioni, K., Gurvich, V., Makino, K.: A pumping algorithm for ergodic stochastic mean payoff games with perfect information. RRR-19-2009, RUTCOR. Rutgers University (2009)
  9. Boros, E., Gurvich, V.: Why chess and back gammon can be solved in pure positional uniformly optimal strategies? RRR-21-2009, RUTCOR. Rutgers University (2009)
  10. Chatterjee, K., Henzinger, T.A.: Reduction of stochastic parity to stochastic mean-payoff games. Inf. Process. Lett. 106(1), 1–7 (2008)
  11. Chatterjee, K., Jurdziński, M., Henzinger, T.A.: Quantitative stochastic parity games. In: SODA ’04: Proceedings of the fifteenth annual ACM-SIAM symposium on Discrete algorithms, pp. 121–130. Society for Industrial and Applied Mathematics, Philadelphia (2004)
  12. Condon, A.: The complexity of stochastic games. Information and Computation 96, 203–224 (1992)
  13. Condon, A.: An algorithm for simple stochastic games. In: Advances in computational complexity theory. DIMACS series in discrete mathematics and theoretical computer science, vol. 13 (1993)
  14. Dhingra, V., Gaubert, S.: How to solve large scale deterministic games with mean payoff by policy iteration. In: Valuetools ’06: Proceedings of the 1st international conference on Performance evaluation methodologies and tools, vol. 12. ACM, New York (2006)
  15. Ehrenfeucht, A., Mycielski, J.: Positional strategies for mean payoff games. International Journal of Game Theory 8, 109–113 (1979)
  16. Friedmann, O.: An exponential lower bound for the parity game strategy improvement algorithm as we know it. In: Symposium on Logic in Computer Science, pp. 145–156 (2009)
  17. Gillette, D.: Stochastic games with zero stop probabilities. In: Dresher, M., Tucker, A.W., Wolfe, P. (eds.) Contribution to the Theory of Games III. Annals of Mathematics Studies, vol. 39, pp. 179–187. Princeton University Press, Princeton (1957)
  18. Gimbert, H., Horn, F.: Simple stochastic games with few random vertices are easy to solve. In: Amadio, R.M. (ed.) FOSSACS 2008. LNCS, vol. 4962, pp. 5–19. Springer, Heidelberg (2008)
  19. Gurvich, V., Karzanov, A., Khachiyan, L.: Cyclic games and an algorithm to find minimax cycle means in directed graphs. USSR Computational Mathematics and Mathematical Physics 28, 85–91 (1988)
  20. Halman, N.: Simple stochastic games, parity games, mean payoff games and discounted payoff games are all LP-type problems. Algorithmica 49(1), 37–50 (2007)
  21. Hoffman, A.J., Karp, R.M.: On nonterminating stochastic games. Management Science, Series A 12(5), 359–370 (1966)
  22. Jurdziński, M.: Deciding the winner in parity games is in UP ∩ co-UP. Inf. Process. Lett. 68(3), 119–124 (1998)
  23. Jurdziński, M.: Games for Verification: Algorithmic Issues. PhD thesis, Faculty of Science, University of Aarhus, Denmark (2000)
  24. Jurdziński, M., Paterson, M., Zwick, U.: A deterministic subexponential algorithm for solving parity games. In: SODA ’06: Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithm, pp. 117–123. ACM, New York (2006)
  25. Karp, R.M.: A characterization of the minimum cycle mean in a digraph. Discrete Math. 23, 309–311 (1978)
  26. Karzanov, A.V., Lebedev, V.N.: Cyclical games with prohibition. Mathematical Programming 60, 277–293 (1993)
  27. Kratsch, D., McConnell, R.M., Mehlhorn, K., Spinrad, J.P.: Certifying algorithms for recognizing interval graphs and permutation graphs. In: SODA ’03: Proceedings of the fourteenth annual ACM-SIAM symposium on Discrete algorithms, pp. 158–167. Society for Industrial and Applied Mathematics, Philadelphia (2003)
  28. Liggett, T.M., Lippman, S.A.: Stochastic games with perfect information and time-average payoff. SIAM Review 4, 604–607 (1969)
  29. Littman, M.L.: Algorithm for sequential decision making, CS-96-09. PhD thesis, Dept. of Computer Science, Brown Univ., USA (1996)
  30. Mine, H., Osaki, S.: Markovian decision process. American Elsevier Publishing Co., New York (1970)
  31. Moulin, H.: Extension of two person zero sum games. Journal of Mathematical Analysis and Application 5(2), 490–507 (1976)
  32. Moulin, H.: Prolongement des jeux à deux joueurs de somme nulle. Bull. Soc. Math. France, Memoire 45 (1976)
  33. Pisaruk, N.N.: Mean cost cyclical games. Mathematics of Operations Research 24(4), 817–828 (1999)
  34. Vöge, J., Jurdziński, M.: A discrete strategy improvement algorithm for solving parity games. In: Emerson, E.A., Sistla, A.P. (eds.) CAV 2000. LNCS, vol. 1855, pp. 202–215. Springer, Heidelberg (2000)
  35. Vorobyov, S.: Cyclic games and linear programming. Discrete Applied Mathematics 156(11), 2195–2231 (2008)
  36. Zwick, U., Paterson, M.: The complexity of mean payoff games on graphs. Theoretical Computer Science 158(1-2), 343–359 (1996)

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Endre Boros (1)
  • Khaled Elbassioni (2)
  • Vladimir Gurvich (1)
  • Kazuhisa Makino (3)

  1. RUTCOR, Rutgers University, Piscataway
  2. Max-Planck-Institut für Informatik, Saarbrücken, Germany
  3. Graduate School of Information Science and Technology, University of Tokyo, Tokyo, Japan
