
Continuous-Time Controlled Markov Chains with Discounted Rewards

Published in: Acta Applicandae Mathematica

Abstract

This paper studies denumerable-state continuous-time controlled Markov chains with the discounted reward criterion and a Borel action space. The reward and transition rates are unbounded, and the reward rates are allowed to take positive or negative values. First, we present new conditions for a nonhomogeneous Q(t)-process to be regular. Then, using these conditions, we give a new set of mild hypotheses that ensure the existence of ε-optimal (ε ≥ 0) stationary policies. We also present a 'martingale characterization' of an optimal stationary policy. Our results are illustrated with controlled birth and death processes.
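The controlled birth-and-death setting described above can be illustrated numerically. The following is a minimal simulation sketch, not the paper's method: it generates one path of a continuous-time birth-death chain under a stationary policy and accumulates the α-discounted reward. All function names, rates, and parameters here are hypothetical illustrations; note the reward rate may be negative (e.g. a holding cost), consistent with the abstract.

```python
import math
import random

def simulate_discounted_reward(policy, birth_rate, death_rate, reward,
                               alpha=1.0, x0=0, horizon=50.0, rng=None):
    """Return the alpha-discounted reward of one simulated path up to `horizon`.

    The chain holds in state x for an Exp(birth_rate + death_rate) time,
    during which reward(x, a) accrues continuously and is discounted at
    rate alpha; it then jumps to x + 1 (birth) or x - 1 (death).
    """
    rng = rng or random.Random(0)
    t, x, total = 0.0, x0, 0.0
    while t < horizon:
        a = policy(x)                       # stationary policy: action depends on state only
        lam = birth_rate(x, a)
        mu = death_rate(x, a) if x > 0 else 0.0   # no deaths from state 0
        q = lam + mu                        # total transition rate out of state x
        tau = rng.expovariate(q) if q > 0 else horizon - t
        tau = min(tau, horizon - t)
        # integral of reward(x, a) * e^{-alpha * s} ds over [t, t + tau]
        total += reward(x, a) * (math.exp(-alpha * t)
                                 - math.exp(-alpha * (t + tau))) / alpha
        t += tau
        if t >= horizon:
            break
        x += 1 if rng.random() < lam / q else -1   # birth w.p. lam/q, else death
    return total

# Hypothetical example: action a = chosen service rate; reward = negative holding cost
v = simulate_discounted_reward(policy=lambda x: 2.0,
                               birth_rate=lambda x, a: 1.0,
                               death_rate=lambda x, a: a,
                               reward=lambda x, a: -x,
                               alpha=0.5, rng=random.Random(42))
print(v)   # nonpositive, since the reward rate is a holding cost
```

Averaging such paths over many seeds gives a Monte Carlo estimate of the discounted value of the stationary policy; the paper's results concern the existence and characterization of policies optimizing this criterion, not its simulation.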




Cite this article

Guo, X., Hernández-Lerma, O. Continuous-Time Controlled Markov Chains with Discounted Rewards. Acta Applicandae Mathematicae 79, 195–216 (2003). https://doi.org/10.1023/B:ACAP.0000003675.06200.45
