Abstract
This paper considers two-person zero-sum sequential games with finite state and action spaces. We consider the pair of functional equations (f.e.) that arises in the undiscounted infinite-stage model and show that a certain class of successive approximation schemes is guaranteed to converge to a solution pair whenever an equilibrium policy with respect to the average-return-per-unit-time criterion (AEP) exists. Existence of the latter thus implies the existence of a solution to this pair of f.e., whereas the converse implication is shown to hold only under special circumstances.
In addition to this pair of f.e., a complete sequence of f.e. has to be considered when analyzing more sensitive optimality criteria that make further selections within the class of AEPs. A number of characterizations of, and interdependencies between, the existence of solutions to the f.e. and the existence of stationary sensitive optimal equilibrium policies are obtained.
Zusammenfassung (Summary)
This paper treats sequential two-person zero-sum games with finite state and action spaces. The pair of functional equations for the undiscounted infinite-stage model is considered, and it is shown that a certain class of successive approximation schemes converges to a solution pair whenever an equilibrium policy exists under the criterion of average payoff per unit time. When more sensitive optimality criteria are considered, a whole sequence of functional equations has to be examined in addition to the above pair. Furthermore, results are derived on the existence of solutions to the functional equations and the related existence of stationary optimal equilibrium policies.
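To illustrate the kind of successive approximation scheme the abstract refers to, the sketch below implements Shapley's classical value iteration for the *discounted* finite stochastic game: at each state the next iterate is the value of a matrix game whose entries combine immediate payoffs with discounted expected continuation values. This is a minimal sketch, not the paper's undiscounted scheme; the function names (`matrix_game_value`, `shapley_iteration`) and the data layout (`r[s]` a payoff matrix, `p[s][a][b]` a distribution over next states) are assumptions made here for illustration. The matrix-game value is computed by the standard linear-programming reduction.

```python
import numpy as np
from scipy.optimize import linprog


def matrix_game_value(A):
    """Value of the zero-sum matrix game with payoff matrix A (row player maximizes).

    Uses the standard LP reduction: shift A so all entries are positive,
    then minimize sum(y) subject to A^T y >= 1, y >= 0; the shifted game's
    value is 1 / sum(y).
    """
    shift = min(0.0, float(A.min())) - 1.0  # ensure all entries of B are > 0
    B = A - shift
    m, n = B.shape
    res = linprog(c=np.ones(m),
                  A_ub=-B.T, b_ub=-np.ones(n),   # B^T y >= 1 in <= form
                  bounds=[(0, None)] * m)
    return 1.0 / res.fun + shift


def shapley_iteration(r, p, beta, tol=1e-8, max_iter=10_000):
    """Successive approximation for a beta-discounted stochastic game.

    r[s]       : (m x n) payoff matrix in state s
    p[s][a][b] : probability distribution over next states
    Iterates v_{k+1}(s) = val[ r[s] + beta * E v_k ] until convergence.
    """
    S = len(r)
    v = np.zeros(S)
    for _ in range(max_iter):
        # expected continuation value for each (action, action) pair
        v_new = np.array([
            matrix_game_value(r[s] + beta * np.tensordot(p[s], v, axes=([2], [0])))
            for s in range(S)
        ])
        if np.max(np.abs(v_new - v)) < tol:
            return v_new
        v = v_new
    return v
```

For a single-state game with the matching-pennies payoff matrix, the fixed point is 0 for any discount factor, since the one-shot game value is 0. In the undiscounted model studied in the paper, plain value iteration of this kind need not converge, which is precisely why modified schemes and the pair of functional equations are analyzed.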
Federgruen, A. On the functional equations in undiscounted and sensitive discounted stochastic games. Zeitschrift für Operations Research 24, 243–262 (1980). https://doi.org/10.1007/BF01919903