Reachability in MDPs: Refining Convergence of Value Iteration

Haddad, Serge; Monmege, Benjamin

doi:10.1007/978-3-319-11439-2_10

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8762)

Abstract

Markov Decision Processes (MDPs) are a widely used model that combines non-deterministic and probabilistic choices. The minimal and maximal probabilities of reaching a target set of states, taken over the policies that resolve the non-determinism, can be computed by several methods, including value iteration. This algorithm, which is easy to implement and efficient in terms of space complexity, iteratively computes the probabilities of paths of increasing length. However, it raises three issues: (1) defining a stopping criterion that ensures a bound on the approximation, (2) analyzing the rate of convergence, and (3) specifying an additional procedure to obtain the exact values once a sufficient number of iterations has been performed. The first two issues are still open, and for the third one only a “crude” upper bound on the number of iterations has been proposed. Based on a graph analysis and transformation of MDPs, we address these problems. First, we introduce an interval iteration algorithm, for which the stopping criterion is straightforward. Then we exhibit its convergence rate. Finally, we significantly improve the bound on the number of iterations required to get the exact values.
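
To illustrate why an interval makes the stopping criterion straightforward, the following is a minimal Python sketch and not the algorithm as specified in the paper. It iterates the maximal-reachability Bellman operator from below and from above and stops when the two bounds are within eps of each other. It assumes, as a simplification, that the MDP has already been preprocessed so that its only end components are the absorbing target and sink states; without that assumption the upper sequence need not converge to the true value. The dictionary encoding, the function names, and the example MDP are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical encoding: state -> list of actions,
# each action being a list of (probability, successor) pairs.
MDP = Dict[str, List[List[Tuple[float, str]]]]


def bellman_max(mdp: MDP, values: Dict[str, float],
                target: Set[str], sink: Set[str]) -> Dict[str, float]:
    """Apply one step of the maximal-reachability Bellman operator."""
    updated = {}
    for state, actions in mdp.items():
        if state in target:
            updated[state] = 1.0
        elif state in sink:
            updated[state] = 0.0
        else:
            # Assumes every non-absorbing state has at least one action.
            updated[state] = max(
                sum(p * values[succ] for p, succ in action)
                for action in actions
            )
    return updated


def interval_iteration(mdp: MDP, target: Set[str], sink: Set[str],
                       eps: float = 1e-6):
    """Iterate a lower and an upper bound until they differ by at most eps."""
    lower = {s: 1.0 if s in target else 0.0 for s in mdp}
    upper = {s: 0.0 if s in sink else 1.0 for s in mdp}
    steps = 0
    while max(upper[s] - lower[s] for s in mdp) > eps:
        lower = bellman_max(mdp, lower, target, sink)
        upper = bellman_max(mdp, upper, target, sink)
        steps += 1
    return lower, upper, steps


# Toy MDP (an assumption for illustration): from s0, action 0 loops with
# probability 0.5, reaches "goal" with 0.3 and "fail" with 0.2; action 1
# reaches "goal" or "fail" with probability 0.5 each.
EXAMPLE: MDP = {
    "s0": [
        [(0.5, "s0"), (0.3, "goal"), (0.2, "fail")],
        [(0.5, "goal"), (0.5, "fail")],
    ],
    "goal": [],
    "fail": [],
}

if __name__ == "__main__":
    lower, upper, steps = interval_iteration(EXAMPLE, {"goal"}, {"fail"})
    print(steps, lower["s0"], upper["s0"])
```

In this toy example the maximal reachability probability from s0 is 0.6 (always choosing the first action gives v = 0.5 v + 0.3), and the returned interval encloses it. Plain value iteration would only produce the lower sequence, so the distance to the true value could not be read off the iterates.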

The research leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007-2013) under Grant Agreement n° 601148 (CASSTING).

Author information

Authors and Affiliations

LSV, ENS Cachan, CNRS & Inria, France
Serge Haddad
Université libre de Bruxelles, Belgium
Benjamin Monmege

Editor information

Editors and Affiliations

Department of Computer Science, University of Oxford, Wolfson Building, Parks Road, OX1 3QD, Oxford, UK
Joël Ouaknine
Department of Computer Science, University of Liverpool, Liverpool, UK
Igor Potapov
Department of Computer Science, University of Oxford, UK
James Worrell

About this paper

Cite this paper

Haddad, S., Monmege, B. (2014). Reachability in MDPs: Refining Convergence of Value Iteration. In: Ouaknine, J., Potapov, I., Worrell, J. (eds) Reachability Problems. RP 2014. Lecture Notes in Computer Science, vol 8762. Springer, Cham. https://doi.org/10.1007/978-3-319-11439-2_10

DOI: https://doi.org/10.1007/978-3-319-11439-2_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11438-5
Online ISBN: 978-3-319-11439-2
eBook Packages: Computer Science, Computer Science (R0)