Mathematical Programming, Volume 59, Issue 1–3, pp 249–259

# A finite step algorithm via a bimatrix game to a single controller non-zero sum stochastic game

• A. S. Nowak
• T. E. S. Raghavan

## Abstract

Given a non-zero sum discounted stochastic game with finitely many states and actions, one can form a bimatrix game whose pure strategies are the pure stationary strategies of the players and whose penalty payoffs consist of the total discounted costs over all states at any pure stationary pair. It is shown that any Nash equilibrium point of this bimatrix game can be used to find a Nash equilibrium point of the stochastic game whenever the law of motion is controlled by one player. The theorem is extended to undiscounted stochastic games with irreducible transitions when the law of motion is controlled by one player. Examples are worked out to illustrate the proposed algorithm.
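The bimatrix construction described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' procedure: the function names `discounted_costs` and `build_bimatrix`, the data layout, and the toy game are all hypothetical; player 2 is assumed to be the controller; and "over all states" is read as a sum over starting states. The step that turns a bimatrix equilibrium into a stochastic-game equilibrium is not described in the abstract and is not attempted here.

```python
from itertools import product

def discounted_costs(cost, P, beta, tol=1e-10):
    """Approximate the solution of v = cost + beta * P v by fixed-point iteration.

    The map is a contraction for 0 <= beta < 1, so the iteration converges.
    """
    n = len(cost)
    v = [0.0] * n
    while True:
        new = [cost[s] + beta * sum(P[s][t] * v[t] for t in range(n))
               for s in range(n)]
        if max(abs(new[s] - v[s]) for s in range(n)) < tol:
            return new
        v = new

def build_bimatrix(c1, c2, q, beta):
    """Build cost matrices A (player 1) and B (player 2).

    c1[s][a][b], c2[s][a][b]: immediate costs in state s under actions (a, b).
    q[s][b]: next-state distribution; it depends only on player 2's action b
             (assumption: player 2 is the single controller).
    Rows and columns are indexed by pure stationary strategies, i.e. one
    action per state for each player.
    """
    n = len(q)
    F = list(product(*[range(len(c1[s])) for s in range(n)]))  # player 1
    G = list(product(*[range(len(q[s])) for s in range(n)]))   # player 2
    A = [[0.0] * len(G) for _ in F]
    B = [[0.0] * len(G) for _ in F]
    for i, f in enumerate(F):
        for j, g in enumerate(G):
            P = [q[s][g[s]] for s in range(n)]
            r1 = [c1[s][f[s]][g[s]] for s in range(n)]
            r2 = [c2[s][f[s]][g[s]] for s in range(n)]
            # Entry = total discounted cost summed over all starting states.
            A[i][j] = sum(discounted_costs(r1, P, beta))
            B[i][j] = sum(discounted_costs(r2, P, beta))
    return F, G, A, B

# Hypothetical toy game: state 1 is absorbing; in state 0 player 2 either
# stays (action 0) or moves the game to state 1 (action 1).
c1 = [[[1.0, 2.0], [3.0, 4.0]], [[5.0]]]
c2 = [[[0.0, 1.0], [1.0, 0.0]], [[2.0]]]
q = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0]]]
F, G, A, B = build_bimatrix(c1, c2, q, beta=0.5)
```

Enumerating every pure stationary strategy makes the matrices grow exponentially with the number of states, so a sketch like this is practical only for very small games.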

### Key words

Stochastic game theory

© The Mathematical Programming Society, Inc. 1993

## Authors and Affiliations

• A. S. Nowak (1)
• T. E. S. Raghavan (2)

1. Institute of Mathematics, Wrocław Technical University, Wrocław, Poland
2. Department of Mathematics, Statistics, and Computer Science, University of Illinois at Chicago, USA