Abstract
We consider a discrete-time, infinite-horizon Markov decision process with locally compact Borel state and action spaces and a possibly unbounded cost function. Based on Lipschitz continuity of the elements of the control model, we propose a state and action discretization procedure for approximating the optimal value function and an optimal policy of the original control model, and we provide explicit bounds on the approximation errors.
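To make the idea concrete, the following is a minimal sketch (not the authors' exact construction) of the kind of procedure the abstract describes: both the state and action spaces are replaced by finite uniform grids, transitions are projected onto the state grid, and value iteration is run on the resulting finite model. The dynamics `f`, one-stage cost `c`, and discount factor are hypothetical Lipschitz choices on [0, 1] introduced purely for illustration.

```python
import numpy as np

ALPHA = 0.9                 # discount factor
N_STATES, N_ACTIONS = 101, 21

# Uniform grids discretizing the (continuous) state and action spaces
states = np.linspace(0.0, 1.0, N_STATES)
actions = np.linspace(0.0, 1.0, N_ACTIONS)

def dynamics(x, a):
    # Hypothetical Lipschitz transition map keeping the state in [0, 1]
    return np.clip(0.5 * x + 0.3 * a, 0.0, 1.0)

def cost(x, a):
    # Hypothetical Lipschitz one-stage cost
    return (x - 0.5) ** 2 + 0.1 * a

def nearest(x):
    # Project continuous states onto the grid: the discretization step
    return np.abs(states[None, :] - np.atleast_1d(x)[:, None]).argmin(axis=1)

# Value iteration on the discretized control model
V = np.zeros(N_STATES)
for _ in range(500):
    Q = np.empty((N_STATES, N_ACTIONS))
    for j, a in enumerate(actions):
        nxt = nearest(dynamics(states, a))        # grid index of next state
        Q[:, j] = cost(states, a) + ALPHA * V[nxt]
    V_new = Q.min(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-10:         # sup-norm stopping rule
        V = V_new
        break
    V = V_new

policy = Q.argmin(axis=1)    # greedy policy for the discretized model
```

Under Lipschitz assumptions of the type the chapter imposes, the gap between `V` and the optimal value function of the original model can be bounded explicitly in terms of the grid mesh sizes and the Lipschitz constants; deriving such bounds is precisely the contribution of the chapter.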
Copyright information
© 2012 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Dufour, F., Prieto-Rumeau, T. (2012). Approximation of Infinite Horizon Discounted Cost Markov Decision Processes. In: Hernández-Hernández, D., Minjárez-Sosa, J. (eds) Optimization, Control, and Applications of Stochastic Systems. Systems & Control: Foundations & Applications. Birkhäuser, Boston. https://doi.org/10.1007/978-0-8176-8337-5_4
DOI: https://doi.org/10.1007/978-0-8176-8337-5_4
Publisher Name: Birkhäuser, Boston
Print ISBN: 978-0-8176-8336-8
Online ISBN: 978-0-8176-8337-5
eBook Packages: Mathematics and Statistics (R0)