Discrete Event Dynamic Systems, Volume 29, Issue 4, pp 445–471

Risk-sensitive continuous-time Markov decision processes with unbounded rates and Borel spaces

  • Xianping Guo
  • Junyu Zhang
Article

Abstract

This paper studies finite-horizon risk-sensitive optimality for continuous-time Markov decision processes in a general setting: the transition rates may be unbounded, the cost/reward rates may be unbounded from below and from above, the policies may be history-dependent, and the state and action spaces are Borel spaces. Under mild conditions on the primitive data of the decision process, we establish the existence of a solution to the corresponding optimality equation (OE) by a so-called approximation technique. Then, using the OE and an extension of Dynkin’s formula developed here, we prove the existence of an optimal Markov policy and verify that the value function is the unique solution to the OE. Finally, we give an example to illustrate the difference between our conditions and those in the previous literature.
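
For orientation, the criterion and the OE mentioned in the abstract can be sketched in the form commonly used for finite-horizon risk-sensitive problems. The notation below is an assumption made for illustration and is not taken from the paper: S and A(x) denote the state space and admissible action sets, q(dy|x,a) the transition rates, c(x,a) the cost rate, g(x) a terminal cost, T the horizon, and ξ_t the state process. The paper's own formulation may differ, for example by a risk-sensitivity coefficient or by time-dependent data.

```latex
% Sketch under assumed notation (S, A(x), q, c, g, T, \xi_t are illustrative,
% not the paper's symbols).

% Risk-sensitive cost of a (possibly history-dependent) policy \pi from state x:
\[
  J(\pi, x) \;=\; \mathbb{E}^{\pi}_{x}\!\left[
    \exp\!\left( \int_0^T \!\! \int_{A} c(\xi_t, a)\, \pi(da \mid \omega, t)\, dt
                 \;+\; g(\xi_T) \right) \right],
  \qquad
  J^{*}(x) \;=\; \inf_{\pi} J(\pi, x).
\]

% Optimality equation for \varphi(t,x), with terminal condition at t = T
% and J^{*}(x) = \varphi(0,x); the equation is understood for almost every t:
\[
  \frac{\partial \varphi}{\partial t}(t,x)
  \;+\; \inf_{a \in A(x)} \Big\{ c(x,a)\, \varphi(t,x)
        \;+\; \int_{S} \varphi(t,y)\, q(dy \mid x, a) \Big\} \;=\; 0,
  \qquad
  \varphi(T,x) \;=\; e^{g(x)} .
\]

% Sanity check: with no jumps (q \equiv 0) and a single action,
% \varphi(t,x) = \exp\big( c(x)(T-t) + g(x) \big) satisfies both lines above,
% since \partial_t \varphi = -c(x)\,\varphi.
```

In this form, a Markov policy that attains the infimum in the OE at each (t,x) is the natural candidate for the optimal Markov policy whose existence the abstract asserts.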

Keywords

Continuous-time Markov decision process; Finite horizon risk-sensitive criterion; History-dependent policy; Unbounded transition/cost rates; Optimal policy

Notes

Acknowledgments

Research supported by the National Natural Science Foundation of China (Grant No. 60171009, Grant No. 61673019, Grant No. 11931018).

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. School of Mathematics, Sun Yat-Sen University, Guangzhou, China
  2. Guangdong Province Key Laboratory of Computational Science, Guangzhou, China