Abstract
Zero-sum stochastic games model situations in which two players jointly control a dynamic system and have opposite objectives. Typically, one player seeks to minimize a cost that must be paid to the other player. Such a game may also be used to model problems with a single controller who has only partial information on the system: the dynamics of the system may depend on a parameter that is unknown to the controller and that may vary over time in an unpredictable way. A worst-case criterion may then be considered, in which the unknown parameter is assumed to be chosen by “nature” (called player 1), and the objective of the controller (player 2) is to design a policy that guarantees the best performance under the worst-case behaviour of nature. The purpose of this paper is to present a survey of stochastic games in queues, covering both tools and applications. The first part is devoted to the tools. We present some existing tools for solving finite horizon and infinite horizon discounted Markov games with unbounded cost, and develop new ones that are typically applicable in queueing problems. We then present new tools and theory for expected average cost stochastic games with unbounded cost. In the second part of the paper we survey existing results on worst-case control of queues, and illustrate the structural properties of the best policies of the controller, of the worst-case policies of nature, and of the value function. Using the theory developed in the first part of the paper, we extend some of these results, which were known to hold for finite horizon costs or for the discounted cost, to the expected average cost.
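The worst-case criterion described above can be written in a standard form. The following is only a sketch in notation assumed here, not taken from the paper: x is the state, a is an action of nature (player 1), b is an action of the controller (player 2), c is the immediate cost, P is the transition law, β in (0,1) is a discount factor, and σ and π are policies of nature and of the controller respectively.

```latex
% Discounted worst-case criterion (notation assumed for illustration):
% the controller minimizes, nature maximizes.
V_\beta(x) \;=\; \inf_{\pi}\,\sup_{\sigma}\;
  \mathbb{E}^{\sigma,\pi}_{x}\!\left[\sum_{t=0}^{\infty} \beta^{t}\, c(X_t, A_t, B_t)\right]

% When the game has a value, it is expected to satisfy a Shapley-type
% optimality equation, where "val" denotes the value of the matrix game
% played over mixed actions a (nature) and b (controller) in state x:
V_\beta(x) \;=\; \operatorname{val}_{a,\,b}
  \left[\, c(x,a,b) + \beta \sum_{y} P(y \mid x,a,b)\, V_\beta(y) \right]

% Expected average cost counterpart of the criterion:
g(x) \;=\; \inf_{\pi}\,\sup_{\sigma}\;
  \limsup_{T\to\infty} \frac{1}{T}\,
  \mathbb{E}^{\sigma,\pi}_{x}\!\left[\sum_{t=0}^{T-1} c(X_t, A_t, B_t)\right]
```

With unbounded cost, as in the queueing problems considered here, the existence of the value and the validity of such an equation require additional conditions, which is what the tools in the first part of the paper address.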
Cite this article
Altman, E., Hordijk, A. Zero-sum Markov games and worst-case optimal control of queueing systems. Queueing Syst 21, 415–447 (1995). https://doi.org/10.1007/BF01149169
DOI: https://doi.org/10.1007/BF01149169