
Zero-sum Markov games and worst-case optimal control of queueing systems

Queueing Systems

Abstract

Zero-sum stochastic games model situations in which two players control a dynamic system and have opposite objectives. Typically, one player wishes to minimize a cost that has to be paid to the other player. Such a game may also be used to model problems with a single controller who has only partial information on the system: the dynamics of the system may depend on some parameter that is unknown to the controller and may vary in time in an unpredictable way. A worst-case criterion may then be considered, where the unknown parameter is assumed to be chosen by “nature” (called player 1), and the objective of the controller (player 2) is to design a policy that guarantees the best performance under the worst-case behaviour of nature. The purpose of this paper is to present a survey of stochastic games in queues, where both tools and applications are considered. The first part is devoted to the tools. We present some existing tools for solving finite horizon and infinite horizon discounted Markov games with unbounded cost, and develop new ones that are typically applicable in queueing problems. We then present some new tools and theory for expected average cost stochastic games with unbounded cost. In the second part of the paper we survey existing results on worst-case control of queues, and illustrate the structural properties of the best policies of the controller, of the worst-case policies of nature, and of the value function. Using the theory developed in the first part of the paper, we extend some of these results, which were known to hold for finite horizon costs or for the discounted cost, to the expected average cost.
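The worst-case criterion described above can be illustrated with a small numerical sketch. The model below is hypothetical and is not taken from the paper: a single queue with a finite buffer in discrete time, where the controller admits or rejects arrivals and nature picks the service probability from a two-point set. All parameter values (`LAM`, `MU`, `N`, `R`, `BETA`) are assumptions chosen for illustration. The sketch also simplifies the game by letting nature respond to the controller's action with pure strategies, so it computes a min-max (upper) value by discounted value iteration rather than the value of a matrix game at each step.

```python
# Hypothetical worst-case admission control of a single queue (illustrative only).
LAM = 0.3         # arrival probability per time slot (assumed)
MU = (0.3, 0.6)   # service probabilities nature may choose from (assumed)
N = 10            # buffer size (assumed)
R = 5.0           # cost charged per rejected customer (assumed)
BETA = 0.9        # discount factor (assumed)


def q_value(v, x, a, mu):
    """One-step cost plus discounted expected future cost in state x."""
    admit = a == 1 and x < N               # a full buffer forces a rejection
    cost = x + (0.0 if admit else R * LAM)  # holding cost + expected rejection cost
    up = v[x + 1] if admit else v[x]        # next value after an arrival
    down = v[x - 1] if x > 0 else v[x]      # next value after a service completion
    stay = 1.0 - LAM - mu                   # probability that nothing happens
    return cost + BETA * (LAM * up + mu * down + stay * v[x])


def solve(tol=1e-9, max_iter=10_000):
    """Min-max value iteration: the controller minimizes, nature maximizes."""
    v = [0.0] * (N + 1)
    for _ in range(max_iter):
        new_v = [
            min(max(q_value(v, x, a, mu) for mu in MU) for a in (0, 1))
            for x in range(N + 1)
        ]
        diff = max(abs(p - q) for p, q in zip(new_v, v))
        v = new_v
        if diff < tol:
            break
    # Controller's action minimizing the worst-case cost in each state.
    controller = [
        min((0, 1), key=lambda a: max(q_value(v, x, a, mu) for mu in MU))
        for x in range(N + 1)
    ]
    # Nature's worst-case response to that action in each state.
    nature = [
        max(MU, key=lambda mu: q_value(v, x, controller[x], mu))
        for x in range(N + 1)
    ]
    return v, controller, nature


if __name__ == "__main__":
    values, controller_policy, nature_policy = solve()
    for x in range(N + 1):
        print(x, round(values[x], 3), controller_policy[x], nature_policy[x])
```

In this toy model the computed value function is increasing in the queue length, so nature's worst-case choice in every non-empty state is the slower service probability. This is the kind of structural property of the value function and of the worst-case policy that the abstract refers to, shown here only for the assumed model.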




Cite this article

Altman, E., Hordijk, A. Zero-sum Markov games and worst-case optimal control of queueing systems. Queueing Syst 21, 415–447 (1995). https://doi.org/10.1007/BF01149169
