
Continuous-time Markov decision processes with risk-sensitive finite-horizon cost criterion

  • Original Article
  • Published in Mathematical Methods of Operations Research

Abstract

This paper studies continuous-time Markov decision processes with a denumerable state space, a Borel action space, bounded cost rates, and possibly unbounded transition rates under the risk-sensitive finite-horizon cost criterion. We give suitable optimality conditions and establish a Feynman–Kac formula, from which we obtain the existence and uniqueness of the solution to the optimality equation and the existence of an optimal deterministic Markov policy. Moreover, employing a finite-approximation technique together with the optimality equation, we present an iteration method for computing the optimal value and an optimal policy approximately, and we give the corresponding error estimates. Finally, a controlled birth-and-death system is used to illustrate the main results.
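To make the criterion and the finite-approximation idea concrete, the following is a minimal numerical sketch. It assumes the value function has the standard multiplicative form u(t, i) = inf over policies of E[ exp( theta * integral from t to T of c(x_s, a_s) ds ) | x_t = i ], and it steps the associated optimality equation backward in time on a truncated controlled birth-and-death chain. The birth-death parameters, cost rate, action set, and the explicit time-discretization below are illustrative assumptions and are not the construction used in the paper.

    import numpy as np

    # Hypothetical controlled birth-and-death chain on the truncated state
    # space {0, 1, ..., N}; actions scale the death (service) rate.
    # All parameters are illustrative, not taken from the paper.
    N = 30                      # truncation level of the finite approximation
    actions = [0.5, 1.0, 2.0]   # admissible service intensities
    lam, mu = 1.0, 1.2          # base birth and death rates
    theta = 0.1                 # risk-sensitivity parameter (theta > 0)
    T, dt = 1.0, 1e-3           # horizon and time step of the explicit scheme

    def cost(i, a):
        # bounded cost rate: holding cost plus control effort (illustrative)
        return min(i, N) / N + 0.2 * a

    def rates(i, a):
        # transition rates of the truncated birth-and-death chain
        up = lam if i < N else 0.0
        down = mu * a if i > 0 else 0.0
        return up, down

    # u[i] approximates inf_pi E[exp(theta * int_t^T c ds) | x_t = i],
    # with terminal condition u(T, i) = 1.
    u = np.ones(N + 1)
    policy = np.zeros(N + 1, dtype=int)

    for _ in range(int(T / dt)):
        new_u = np.empty_like(u)
        for i in range(N + 1):
            best, best_a = np.inf, 0
            for k, a in enumerate(actions):
                up, down = rates(i, a)
                # generator applied to u at state i plus the multiplicative
                # cost term of the risk-sensitive optimality equation
                drift = (theta * cost(i, a) * u[i]
                         + up * (u[min(i + 1, N)] - u[i])
                         + down * (u[max(i - 1, 0)] - u[i]))
                if drift < best:
                    best, best_a = drift, k
            new_u[i] = u[i] + dt * best
            policy[i] = best_a   # greedy action; after the loop, at time ~0
        u = new_u

    # (1/theta) * log u(0, i) approximates the risk-sensitive optimal cost
    print((np.log(u) / theta)[:5])

In this sketch, (1/theta) log u(0, i) approximates the risk-sensitive optimal cost over [0, T] from state i, and the recorded greedy actions give a candidate deterministic Markov policy; shrinking dt and enlarging the truncation level N plays the role of the paper's finite approximation, whose error estimates quantify the gap to the original unbounded-rate model.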

Acknowledgments

I am greatly indebted to the associate editor and the anonymous referees for many valuable comments and suggestions that have greatly improved the presentation. The research was supported by National Natural Science Foundation of China (Grant No. 11526092).

Author information

Correspondence to Qingda Wei.

About this article

Cite this article

Wei, Q. Continuous-time Markov decision processes with risk-sensitive finite-horizon cost criterion. Math Meth Oper Res 84, 461–487 (2016). https://doi.org/10.1007/s00186-016-0550-4
