
Continuous-Time Markov Decision Processes with State-Dependent Discount Factors

Published in: Acta Applicandae Mathematicae

Abstract

We consider continuous-time Markov decision processes in Polish spaces. The performance of a control policy is measured by the expected discounted reward criterion associated with state-dependent discount factors. All underlying Markov processes are determined by the given transition rates, which are allowed to be unbounded, and the reward rates may have neither upper nor lower bounds. Using the dynamic programming approach, we establish the discounted reward optimality equation (DROE) and the existence and uniqueness of its solutions. Under suitable conditions, we also obtain a discounted optimal stationary policy that is optimal in the class of all randomized stationary policies. Moreover, when the transition rates are uniformly bounded, we provide an algorithm to compute (or at least to approximate) the discounted reward optimal value function as well as a discounted optimal stationary policy. Finally, we illustrate our results with an example, for which we derive an explicit and exact solution to the DROE and an explicit expression of a discounted optimal stationary policy.
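When the transition rates are uniformly bounded, the computation described in the abstract can be sketched by value iteration after uniformization. The sketch below is illustrative only, not the authors' algorithm: it assumes a finite toy instance with transition-rate matrices `q` (off-diagonal entries nonnegative, rows summing to zero), reward rates `r`, and state-dependent discount factors `alpha(x) > 0`, all invented here. Adding a uniformization constant `Lam` bounding the rates turns the DROE, `alpha(x) V(x) = max_a { r(x,a) + sum_y q(y|x,a) V(y) }`, into a fixed point of a contraction with modulus `Lam / (min(alpha) + Lam) < 1`, so iteration converges.

```python
import numpy as np

# Hypothetical toy instance (all numbers illustrative): 2 states, 2 actions.
# q[a][x][y]: transition rates (rows sum to 0), r[a][x]: reward rates,
# alpha[x]: state-dependent discount factors.
q = np.array([[[-1.0, 1.0], [2.0, -2.0]],
              [[-3.0, 3.0], [0.5, -0.5]]])
r = np.array([[1.0, 4.0],
              [2.0, 3.0]])
alpha = np.array([0.5, 1.0])

Lam = np.abs(q).max()        # uniformization constant >= sup |q(x,x,a)|
I = np.eye(q.shape[1])

def bellman(V):
    # T V(x) = max_a ( r(x,a) + sum_y (q(y|x,a) + Lam*1{y=x}) V(y) ) / (alpha(x) + Lam)
    vals = np.array([(r[a] + (q[a] + Lam * I) @ V) / (alpha + Lam)
                     for a in range(q.shape[0])])
    return vals.max(axis=0), vals.argmax(axis=0)

V = np.zeros(q.shape[1])
for _ in range(500):         # contraction => geometric convergence
    V_new, policy = bellman(V)
    if np.max(np.abs(V_new - V)) < 1e-10:
        V = V_new
        break
    V = V_new

print("value function:", V)
print("stationary policy (action per state):", policy)
```

At the fixed point, `V` satisfies the DROE, and `policy` selects a maximizing action in each state, i.e., a discounted optimal stationary policy for this toy instance.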



Acknowledgements

Research supported by NSFC, GDUPS, and GPK-LCS.

Author information


Corresponding author

Correspondence to Liuer Ye.


About this article

Cite this article

Ye, L., Guo, X. Continuous-Time Markov Decision Processes with State-Dependent Discount Factors. Acta Appl Math 121, 5–27 (2012). https://doi.org/10.1007/s10440-012-9669-3

