Abstract
We consider a discrete-time, infinite-horizon Markov decision process with locally compact Borel state and action spaces and a possibly unbounded cost function. Based on Lipschitz continuity of the elements of the control model, we propose a state and action discretization procedure for approximating the optimal value function and an optimal policy of the original control model, and we provide explicit bounds on the approximation errors.
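To make the idea concrete, the following is a minimal sketch (not the authors' exact construction) of the kind of procedure the abstract describes: both the state and action spaces are replaced by finite uniform grids, transitions are projected onto the state grid, and value iteration is run on the resulting finite model. The dynamics `f`, one-stage cost `c`, and discount factor are hypothetical Lipschitz choices on [0, 1] introduced purely for illustration.

```python
import numpy as np

ALPHA = 0.9                 # discount factor
N_STATES, N_ACTIONS = 101, 21

# Uniform grids discretizing the (continuous) state and action spaces
states = np.linspace(0.0, 1.0, N_STATES)
actions = np.linspace(0.0, 1.0, N_ACTIONS)

def dynamics(x, a):
    # Hypothetical Lipschitz transition map keeping the state in [0, 1]
    return np.clip(0.5 * x + 0.3 * a, 0.0, 1.0)

def cost(x, a):
    # Hypothetical Lipschitz one-stage cost
    return (x - 0.5) ** 2 + 0.1 * a

def nearest(x):
    # Project continuous states onto the grid: the discretization step
    return np.abs(states[None, :] - np.atleast_1d(x)[:, None]).argmin(axis=1)

# Value iteration on the discretized control model
V = np.zeros(N_STATES)
for _ in range(500):
    Q = np.empty((N_STATES, N_ACTIONS))
    for j, a in enumerate(actions):
        nxt = nearest(dynamics(states, a))        # grid index of next state
        Q[:, j] = cost(states, a) + ALPHA * V[nxt]
    V_new = Q.min(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-10:         # sup-norm stopping rule
        V = V_new
        break
    V = V_new

policy = Q.argmin(axis=1)    # greedy policy for the discretized model
```

Under Lipschitz assumptions of the type the chapter imposes, the gap between `V` and the optimal value function of the original model can be bounded explicitly in terms of the grid mesh sizes and the Lipschitz constants; deriving such bounds is precisely the contribution of the chapter.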
Copyright information
© 2012 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Dufour, F., Prieto-Rumeau, T. (2012). Approximation of Infinite Horizon Discounted Cost Markov Decision Processes. In: Hernández-Hernández, D., Minjárez-Sosa, J. (eds) Optimization, Control, and Applications of Stochastic Systems. Systems & Control: Foundations & Applications. Birkhäuser, Boston. https://doi.org/10.1007/978-0-8176-8337-5_4
DOI: https://doi.org/10.1007/978-0-8176-8337-5_4
Publisher Name: Birkhäuser, Boston
Print ISBN: 978-0-8176-8336-8
Online ISBN: 978-0-8176-8337-5
eBook Packages: Mathematics and Statistics (R0)