Abstract
This paper concerns countable state space Markov decision processes endowed with a (long-run expected) average reward criterion. For these models we summarize and, in some cases, extend recent results on sufficient conditions for the existence of optimal stationary policies. The topics considered are the following: (i) the new assumptions introduced by Sennott in [20–23], (ii) necessary and sufficient conditions for the existence of a bounded solution to the optimality equation, and (iii) equivalence of average optimality criteria. Some open problems are posed.
References
R.B. Ash, Real Analysis and Probability (Academic Press, New York, 1972).
J.S. Baras, A.J. Dorsey and A.M. Makowski, Two competing queues with linear costs and geometric service requirements: The μc-rule is often optimal, Adv. Appl. Prob. 17 (1985) 186–209.
V.S. Borkar, Controlled Markov chains and stochastic networks, SIAM J. Control Optim. 21 (1983) 652–666.
V.S. Borkar, On minimum cost per unit of time control of Markov chains, SIAM J. Control Optim. 22 (1984) 965–978.
V.S. Borkar, Control of Markov chains with long-run average cost criterion: The dynamic programming equations, SIAM J. Control Optim. 27 (1989) 965–978.
R. Cavazos-Cadena, Necessary and sufficient conditions for a bounded solution to the optimality equation in average reward Markov decision chains, Syst. Control Lett. 10 (1988) 71–78.
R. Cavazos-Cadena, Necessary conditions for the optimality equation in average-reward Markov decision processes, Appl. Math. Optim. 19 (1989) 97–112.
R. Cavazos-Cadena, Weak conditions for the existence of optimal stationary policies in average Markov decision chains with unbounded costs, Kybernetika (Prague) 25 (1989) 145–156.
R. Cavazos-Cadena, Solution to the optimality equation in a class of Markov decision chains with the average cost criterion, Kybernetika (Prague) 27 (1991) 23–37.
R. Cavazos-Cadena and L.I. Sennott, Comparing recent assumptions for the existence of optimal stationary policies, submitted.
J. Dugundji, Topology (Allyn and Bacon, New York, 1960).
A. Federgruen, A. Hordijk and H.C. Tijms, A note on simultaneous recurrence conditions on a set of denumerable matrices, J. Appl. Prob. 15 (1978) 842–847.
A. Federgruen, P.J. Schweitzer and H.C. Tijms, Denumerable undiscounted semi-Markov decision processes with unbounded rewards, Math. Oper. Res. 8 (1983) 298–313.
O. Hernández-Lerma, Adaptive Markov Control Processes (Springer, New York, 1989).
D. Heyman and M. Sobel, Stochastic Models in Operations Research, vol. 2 (McGraw-Hill, New York, 1984).
K. Hinderer, Foundations of Non-Stationary Dynamic Programming with Discrete Time Parameter, Lecture Notes in Operations Research 33 (Springer, New York, 1970).
A. Hordijk, Dynamic Programming and Potential Theory, Mathematical Centre Tracts 51, Amsterdam, The Netherlands (1974).
M. Loève, Probability Theory, vols. I and II (Springer, New York, 1977).
P. Nain and K.W. Ross, Optimal priority assignment with hard constraints, IEEE Trans. Auto. Control AC-31 (1988) 883–888.
S.M. Ross, Applied Probability Models with Optimization Applications (Holden-Day, San Francisco, 1970).
L.I. Sennott, A new condition for the existence of optimum stationary policies in average cost Markov decision processes, Oper. Res. Lett. (1986) 17–23.
L.I. Sennott, A new condition for the existence of optimum stationary policies in average cost Markov decision processes – unbounded costs case, Proc. 25th IEEE Conf. on Decision and Control (1986) pp. 1719–1721.
L.I. Sennott, Average cost optimal stationary policies in infinite state Markov decision processes with unbounded costs, Oper. Res. 37 (1989) 626–633.
L.I. Sennott, Average cost semi-Markov decision processes and the control of queueing systems, Prob. Eng. Inf. Sci. 3 (1988) 247–272.
L.C. Thomas, Connectedness conditions for denumerable state Markov decision processes, in: Recent Developments in Markov Decision Processes, eds. R. Hartley, L.C. Thomas and D.J. White (Academic Press, New York, 1980) pp. 181–204.
R.R. Weber and S. Stidham Jr., Optimal control of service rates in network of queues, Adv. Appl. Prob. 19 (1987) 202–218.
Additional information
This research was partially supported by the Third World Academy of Sciences (TWAS) under Grant No. TWAS RG MP 898-152.
Cite this article
Cavazos-Cadena, R. Recent results on conditions for the existence of average optimal stationary policies. Ann Oper Res 28, 3–27 (1991). https://doi.org/10.1007/BF02055572
DOI: https://doi.org/10.1007/BF02055572