Abstract
Recently, Dekker and Hordijk [3,4] introduced conditions for the existence of deterministic Blackwell optimal policies in denumerable Markov decision chains with unbounded rewards. These conditions include μ-uniform geometric recurrence.
The μ-uniform geometric recurrence property also implies the existence of average optimal policies, a solution to the average optimality equation with explicit formulas, and convergence of the value iteration algorithm for average rewards. For this reason, the verification of μ-uniform geometric convergence is also useful in cases where average and α-discounted rewards are considered.
On the other hand, μ-uniform geometric recurrence is a heavy condition on the Markov decision chain structure for negative dynamic programming problems. The verification of μ-uniform geometric recurrence for the Markov chain induced by some deterministic policy, together with results by Sennott [14], yields the existence of a deterministic policy that minimizes the expected average cost for non-negative immediate cost functions.
In this paper, μ-uniform geometric recurrence will be proved for two queueing models: the K competing queues and the two-centre open Jackson network with control of the service rates.
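To give a feel for what such a verification involves, the following is a minimal sketch on a toy model that is not one of the two models treated in the paper. It assumes the usual one-step form of the condition: a finite set M and a constant β < 1 such that, for every state i and every action a, the sum over j outside M of P(i, a, j) μ(j) is at most β μ(i). The toy chain (a controlled random walk on the non-negative integers), the action set `actions`, the bounding function μ(i) = z^i, and all function names are hypothetical choices made for illustration only.

```python
import math

def taboo_drift_ratio(i, p, z):
    """Ratio sum_{j not in M} P(i, j) * mu(j) / mu(i) with M = {0}, mu(j) = z**j.

    Toy chain (assumption): from state i the walk moves up with probability p
    and down with probability 1 - p; in state 0 a "down" step stays in 0.
    """
    q = 1.0 - p
    up = p * z ** (i + 1)                        # target i + 1 is never in M
    down = q * z ** (i - 1) if i >= 2 else 0.0   # for i <= 1 the down step lands in M
    return (up + down) / z ** i

def uniform_recurrence_factor(actions, z, n_states=50):
    """Largest ratio over the first n_states states and all actions.

    Taking the maximum over (state, action) pairs bounds the ratio for every
    deterministic stationary policy at once, which is the "uniform" part.
    For this walk the ratio is constant for i >= 2, so a finite scan suffices.
    """
    return max(taboo_drift_ratio(i, p, z)
               for i in range(n_states) for p in actions)

actions = [0.2, 0.3]              # hypothetical up-probabilities, all below 1/2
z = math.sqrt(0.7 / 0.3)          # minimises p*z + (1 - p)/z for p = 0.3
beta = uniform_recurrence_factor(actions, z)
print(beta < 1.0, beta)
```

With μ constant (z = 1) the same computation returns a factor of 1, so no contraction is obtained; the geometric growth of μ is what makes β < 1 possible in this toy case.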
References
[1] J.S. Baras, D.-J. Ma and A.M. Makowski, K competing queues with geometric service requirements and linear costs: the μc-rule is always optimal, Systems Control Lett. 6 (1985) 186–209.
[2] C. Buyukkoc, P. Varaiya and J. Walrand, The cμ rule revisited, Adv. Appl. Prob. 17 (1985) 237–238.
[3] R. Dekker and A. Hordijk, Average, sensitive and Blackwell optimal policies in denumerable Markov decision chains with unbounded rewards, Math. Operat. Res. 13 (1988) 395–421.
[4] R. Dekker and A. Hordijk, Recurrence conditions for average and Blackwell optimality in denumerable state Markov decision chains, Technical report, Dep. of Math. and Comp. Sci., Univ. of Leiden (1989), to appear in Math. Operat. Res.
[5] R. Dekker and A. Hordijk, Denumerable semi-Markov decision chains with small interest rates, this volume, pp. 185–212.
[6] R. Dekker, A. Hordijk and F.M. Spieksma, On the relation between recurrence and ergodicity conditions in denumerable Markov decision chains, Technical report, Dep. of Math. and Comp. Sci., Univ. of Leiden (1990), forthcoming.
[7] A. Hordijk and F.M. Spieksma, On ergodicity and recurrence properties of a Markov chain with an application to an open Jackson network, Technical report, Dep. of Math. and Comp. Sci., Univ. of Leiden (1989), submitted for publication.
[8] A. Hordijk and F.M. Spieksma, On the convergence of successive approximation under strong recurrence conditions, Technical report, Dep. of Math. and Comp. Sci., Univ. of Leiden (1990), forthcoming.
[9] D.G. Kendall, Geometric ergodicity and the theory of queues, in: Mathematical Methods in the Social Sciences, eds. K.J. Arrow, S. Karlin and P. Suppes (Stanford University Press, Stanford, 1960) pp. 176–195.
[10] G.P. Klimov, Time-sharing service systems I, Th. Prob. Appl. 19 (1974) 532–551.
[11] J.B. Lasserre, Conditions for existence of average and Blackwell optimal stationary policies in denumerable Markov decision process, J. Math. Anal. Appl. 136 (1988) 479–490.
[12] A.M. Makowski and A. Shwartz, Recurrence properties of a discrete-time single-server network with random routing, EE PUB No. 718 (1989).
[13] Ph. Nain, Interchange arguments for classical scheduling problems in queues, Systems Control Lett. 12 (1989) 177–184.
[14] L.I. Sennott, Average cost optimal stationary policies in infinite state Markov decision processes with unbounded costs, Operat. Res. 37 (1989) 626–633.
[15] L.I. Sennott, Average cost semi-Markov decision processes and the control of queueing systems, Prob. Eng. Inf. Sci. 2 (1989) 247–272.
[16] F.M. Spieksma, Geometric ergodicity of the ALOHA-system and a coupled processors model, Technical report, Dep. of Math. and Comp. Sci., Univ. of Leiden (1990), forthcoming.
[17] F.M. Spieksma, Geometrically ergodic Markov chains and the optimal control of queues, unpublished doctoral dissertation, Univ. of Leiden (1990) (available on request from the author).
[18] Sh. Stidham Jr. and R.R. Weber, Monotonic and insensitive optimal policies for control of queues with undiscounted costs, Operat. Res. 37 (1989) 611–625.
[19] R.R. Weber and Sh. Stidham Jr., Optimal control of service rates in networks of queues, Adv. Appl. Prob. 19 (1987) 202–218.
[20] J. Wessels, Markov programming and successive approximations with respect to weighted supremum norms, J. Math. Anal. Appl. 58 (1977) 326–335.
[21] H. Zijm, The optimality equations in multichain denumerable state Markov decision processes with the average cost criterion: the bounded cost case, Stat. Dec. 3 (1985) 143–165.
Additional information
The research of the author is supported by the Netherlands Organization for Scientific Research N.W.O.
Cite this article
Spieksma, F. The existence of sensitive optimal policies in two multi-dimensional queueing models. Ann Oper Res 28, 273–295 (1991). https://doi.org/10.1007/BF02055586
DOI: https://doi.org/10.1007/BF02055586