Abstract
This paper studies denumerable-state continuous-time controlled Markov chains with the discounted reward criterion and a Borel action space. The reward and transition rates are unbounded, and the reward rates are allowed to take positive or negative values. First, we present new conditions for a nonhomogeneous Q(t)-process to be regular. Then, using these conditions, we give a new set of mild hypotheses that ensure the existence of ε-optimal (ε ≥ 0) stationary policies. We also present a "martingale characterization" of an optimal stationary policy. Our results are illustrated with controlled birth and death processes.
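For orientation, the discounted reward criterion mentioned above typically takes the following standard form for continuous-time Markov decision processes; the notation here (discount factor α, reward rate r, state process x(t)) is the conventional one and is an assumption, not taken from the paper's own definitions:

```latex
V_{\alpha}(i,\pi) \;=\; \mathbb{E}_{i}^{\pi}\!\left[\int_{0}^{\infty} e^{-\alpha t}\, r\bigl(x(t),\,a(t)\bigr)\,dt\right],
\qquad \alpha > 0,
```

where the expectation is over trajectories of the controlled chain starting in state i under policy π. A stationary policy π* is then called ε-optimal if V_α(i, π*) ≥ sup_π V_α(i, π) − ε for every state i; the case ε = 0 gives an optimal policy.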
Guo, X., Hernández-Lerma, O. Continuous-Time Controlled Markov Chains with Discounted Rewards. Acta Applicandae Mathematicae 79, 195–216 (2003). https://doi.org/10.1023/B:ACAP.0000003675.06200.45