Abstract
We consider continuous-time Markov decision processes in Polish spaces. The performance of a control policy is measured by the expected discounted reward criterion associated with state-dependent discount factors. All underlying Markov processes are determined by given transition rates, which are allowed to be unbounded, and the reward rates may have neither upper nor lower bounds. Using the dynamic programming approach, we establish the discounted reward optimality equation (DROE) and the existence and uniqueness of its solutions. Under suitable conditions, we also obtain a discounted optimal stationary policy that is optimal in the class of all randomized stationary policies. Moreover, when the transition rates are uniformly bounded, we provide an algorithm to compute (or at least approximate) the discounted reward optimal value function as well as a discounted optimal stationary policy. Finally, we illustrate our results with an example; in particular, we derive an explicit and exact solution to the DROE and an explicit expression for a discounted optimal stationary policy for this example.
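The bounded-rates case admits a concrete computation via uniformization: with a constant m bounding the total exit rates, the DROE α(x)V(x) = sup_a [r(x,a) + ∫ V(y) q(dy|x,a)] can be rewritten as a fixed-point equation whose iteration map is a contraction with modulus m/(m + min_x α(x)) < 1. The following is a minimal numerical sketch on a hypothetical finite model (the two-state rates, rewards, and discount factors below are illustrative, not taken from the paper):

```python
import numpy as np

# Illustrative finite CTMDP (not from the paper): 2 states, 2 actions.
# q[x, a, y]: transition rates; each row q[x, a, :] sums to zero.
q = np.array([[[-1.0,  1.0],
               [-2.0,  2.0]],
              [[ 3.0, -3.0],
               [ 1.0, -1.0]]])
r = np.array([[1.0, 2.0],      # r[x, a]: reward rates
              [0.5, 3.0]])
alpha = np.array([0.5, 1.0])   # alpha[x]: state-dependent discount factors

# Uniformization constant: m >= sup_{x,a} (-q(x | x, a)).
m = np.max(-q[np.arange(2), :, np.arange(2)])

# Value iteration on the uniformized DROE:
#   V(x) = max_a [ r(x,a) + m V(x) + sum_y q(y | x, a) V(y) ] / (alpha(x) + m),
# a sup-norm contraction with modulus m / (m + min_x alpha(x)) < 1.
V = np.zeros(2)
while True:
    Q = (r + m * V[:, None] + q @ V) / (alpha[:, None] + m)
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-10:
        V = V_new
        break
    V = V_new

policy = Q.argmax(axis=1)  # greedy stationary deterministic policy
print(V, policy)
```

At the fixed point, V satisfies the original DROE, α(x)V(x) = max_a [r(x,a) + Σ_y q(y|x,a)V(y)], since the term m·V(x) is constant in a and cancels across both sides.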
Acknowledgements
Research supported by NSFC, GDUPS, and GPK-LCS.
Cite this article
Ye, L., Guo, X. Continuous-Time Markov Decision Processes with State-Dependent Discount Factors. Acta Appl Math 121, 5–27 (2012). https://doi.org/10.1007/s10440-012-9669-3
Keywords
- Continuous-time Markov decision processes
- State-dependent discount factor
- Dynamic programming
- Explicit and exact solution
- Explicit expression