Discrete-time control with non-constant discount factor

  • Original Article
Mathematical Methods of Operations Research

Abstract

This paper deals with discrete-time Markov decision processes (MDPs) with Borel state and action spaces under the total expected discounted cost optimality criterion. The discount factor is not assumed to be constant: it may depend on the state and the action, and it may even take the extreme values zero or one. We propose sufficient conditions on the data of the model that ensure the existence of optimal control policies and allow the optimal value function to be characterized as a solution to the dynamic programming equation. As a particular case of these MDPs with varying discount factor, we study MDPs with stopping, together with the corresponding optimal stopping times and contact set. We show applications to switching MDP models and, in particular, study a pollution accumulation problem.
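
As an informal illustration of the setting described in the abstract, the sketch below runs value iteration on a small finite MDP whose discount factor depends on the state and the action, and models stopping as an action with discount factor zero. It is only a toy under stated assumptions: the paper works with Borel spaces, whereas here the state and action sets are finite, and every name and number (`value_iteration`, `cost`, `discount`, `trans`, the actions "go" and "stop") is hypothetical and not taken from the paper. The continuing actions keep the discount factor strictly below one so that the iteration is a contraction; the extreme value one is covered by the paper's own conditions, which are not reproduced here.

```python
# Toy finite MDP with a state-action-dependent discount factor.
# All names and numbers are illustrative assumptions, not data from the paper.

def value_iteration(states, actions, cost, discount, trans, tol=1e-12, max_iter=10_000):
    """Iterate v(x) <- min_a [ c(x,a) + alpha(x,a) * sum_y Q(y|x,a) v(y) ]."""
    v = {x: 0.0 for x in states}
    for _ in range(max_iter):
        new_v = {
            x: min(q_value(x, a, v, cost, discount, trans) for a in actions)
            for x in states
        }
        gap = max(abs(new_v[x] - v[x]) for x in states)
        v = new_v
        if gap < tol:
            break
    return v


def q_value(x, a, v, cost, discount, trans):
    """One-step cost plus discounted expected continuation value."""
    expected = sum(p * v[y] for y, p in trans[x, a].items())
    return cost[x, a] + discount[x, a] * expected


states = [0, 1]
actions = ["go", "stop"]

# One-step costs c(x, a); for "stop" this plays the role of a terminal cost.
cost = {(0, "go"): 1.0, (0, "stop"): 5.0, (1, "go"): 2.0, (1, "stop"): 3.0}

# Discount factor alpha(x, a): zero for "stop", so nothing is paid afterwards.
discount = {(0, "go"): 0.9, (0, "stop"): 0.0, (1, "go"): 0.5, (1, "stop"): 0.0}

# Transition probabilities Q(y | x, a); irrelevant after "stop" since alpha = 0.
trans = {
    (0, "go"): {0: 0.5, 1: 0.5},
    (1, "go"): {1: 1.0},
    (0, "stop"): {0: 1.0},
    (1, "stop"): {1: 1.0},
}

v = value_iteration(states, actions, cost, discount, trans)

# Greedy policy with respect to v, and the set of states where stopping is optimal.
policy = {
    x: min(actions, key=lambda a: q_value(x, a, v, cost, discount, trans))
    for x in states
}
contact_set = [x for x in states if policy[x] == "stop"]

print(v, policy, contact_set)
```

In this toy model the fixed point can be checked by hand: in state 1 stopping costs 3 while continuing costs 2 + 0.5 · 3 = 3.5, so state 1 belongs to the contact set; in state 0 continuing solves v(0) = 2.35 + 0.45 · v(0), which is below the stopping cost of 5.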

Author information

Corresponding author

Correspondence to Tomás Prieto-Rumeau.

Additional information

Supported by Grant MTM2016-75497-P from the Spanish Ministerio de Economía y Competitividad.

About this article

Cite this article

Jasso-Fuentes, H., Menaldi, JL. & Prieto-Rumeau, T. Discrete-time control with non-constant discount factor. Math Meth Oper Res 92, 377–399 (2020). https://doi.org/10.1007/s00186-020-00716-8

  • DOI: https://doi.org/10.1007/s00186-020-00716-8