Abstract
In this chapter we study the finite-state approximation problem of computing near-optimal policies for discrete-time MDPs with Borel state and action spaces, under discounted and average cost criteria. Even though the existence and structural properties of optimal policies for MDPs have been studied extensively in the literature, computing such policies is generally a challenging problem for systems with uncountable state spaces. This situation also arises in the fully observed reduction of a partially observed Markov decision process, even when the original system has finite state and action spaces. Here we show that one way to compute approximately optimal solutions for such MDPs is to construct a reduced model, with a new transition probability and one-stage cost function, by quantizing the state space, i.e., by discretizing it on a finite grid. It is reasonable to expect that when the one-stage cost function and the transition probability of the original model have certain continuity properties, the cost of the optimal policy for the approximating finite model converges to the optimal cost of the original model as the discretization becomes finer. Moreover, under additional continuity conditions on the transition probability and the one-stage cost function, we also obtain bounds on the accuracy of the approximation in terms of the number of points used to discretize the state space, thereby providing a tradeoff between the computation cost and the performance loss in the system. In particular, we study the following two problems.
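The quantization idea described above can be illustrated numerically. The following is a minimal sketch, not the chapter's construction: it assumes a toy one-dimensional MDP on the state space [0, 1] with two actions, additive Gaussian noise dynamics, a quadratic one-stage cost, and a discounted criterion, all of which are illustrative choices. The state space is discretized into uniform cells, the induced finite transition matrix is estimated by Monte Carlo from each cell's representative point, and value iteration is run on the resulting finite model.

```python
import numpy as np

def quantize_mdp(n_bins=50, n_actions=2, beta=0.9, n_samples=2000, seed=0):
    """Build and solve a finite-state approximation of a toy MDP on [0, 1]."""
    rng = np.random.default_rng(seed)
    # Representative point (midpoint) of each quantization cell.
    centers = (np.arange(n_bins) + 0.5) / n_bins

    # Toy dynamics: x' = clip(x + 0.1*(a - 0.5) + noise, 0, 1).
    # Estimate the finite transition matrix by sampling next states
    # from each representative point and histogramming them into cells.
    P = np.zeros((n_actions, n_bins, n_bins))
    for a in range(n_actions):
        for i, x in enumerate(centers):
            noise = 0.05 * rng.standard_normal(n_samples)
            x_next = np.clip(x + 0.1 * (a - 0.5) + noise, 0.0, 1.0 - 1e-12)
            idx = (x_next * n_bins).astype(int)  # cell index of each sample
            P[a, i] = np.bincount(idx, minlength=n_bins) / n_samples

    # One-stage cost evaluated at the representatives (toy quadratic cost).
    c = np.array([[(x - 0.5) ** 2 + 0.01 * a for x in centers]
                  for a in range(n_actions)])          # shape (A, n_bins)

    # Standard value iteration on the finite model.
    V = np.zeros(n_bins)
    for _ in range(1000):
        Q = c + beta * P @ V                           # shape (A, n_bins)
        V_new = Q.min(axis=0)
        if np.max(np.abs(V_new - V)) < 1e-10:
            V = V_new
            break
        V = V_new
    policy = Q.argmin(axis=0)                          # greedy finite policy
    return centers, V, policy
```

Increasing `n_bins` refines the grid; the convergence and rate-of-convergence results discussed in the chapter concern how the optimal cost of such finite models approaches that of the original Borel-space model as the discretization becomes finer.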
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this chapter
Cite this chapter
Saldi, N., Linder, T., Yüksel, S. (2018). Finite-State Approximation of Markov Decision Processes. In: Finite Approximations in Discrete-Time Stochastic Control. Systems & Control: Foundations & Applications. Birkhäuser, Cham. https://doi.org/10.1007/978-3-319-79033-6_4
Print ISBN: 978-3-319-79032-9
Online ISBN: 978-3-319-79033-6
eBook Packages: Mathematics and Statistics (R0)