Skip to main content

Reachability in MDPs: Refining Convergence of Value Iteration

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 8762))

Abstract

Markov Decision Processes (MDP) are a widely used model including both non-deterministic and probabilistic choices. Minimal and maximal probabilities to reach a target set of states, with respect to a policy resolving non-determinism, may be computed by several methods including value iteration. This algorithm, easy to implement and efficient in terms of space complexity, consists in iteratively finding the probabilities of paths of increasing length. However, it raises three issues: (1) defining a stopping criterion ensuring a bound on the approximation, (2) analyzing the rate of convergence, and (3) specifying an additional procedure to obtain the exact values once a sufficient number of iterations has been performed. The first two issues are still open and for the third one a “crude” upper bound on the number of iterations has been proposed. Based on a graph analysis and transformation of MDPs, we address these problems. First we introduce an interval iteration algorithm, for which the stopping criterion is straightforward. Then we exhibit convergence rate. Finally we significantly improve the bound on the number of iterations required to get the exact values.

The research leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007-2013) under Grant Agreement n601148 (CASSTING).

This is a preview of subscription content, log in via an institution.

Buying options

Chapter
USD   29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD   39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD   54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Baier, C., Katoen, J.-P.: Principles of Model Checking. MIT Press (2008)

    Google Scholar 

  2. Brázdil, T., Chatterjee, K., Chmelík, M., Forejt, V., Křetínský, J., Kwiatkowska, M., Parker, D., Ujma, M.: Verification of Markov decision processes using learning algorithms. Research Report arXiv:1402.2967 (2014)

    Google Scholar 

  3. Chatterjee, K., Henzinger, T.A.: Value iteration. In: Grumberg, O., Veith, H. (eds.) 25 Years of Model Checking. LNCS, vol. 5000, pp. 107–138. Springer, Heidelberg (2008)

    Chapter  Google Scholar 

  4. de Alfaro, L.: Formal Verification of Probabilistic Systems. PhD thesis, Stanford University (1997)

    Google Scholar 

  5. Forejt, V., Kwiatkowska, M., Norman, G., Parker, D.: Automated verification techniques for probabilistic systems. In: Bernardo, M., Issarny, V. (eds.) SFM 2011. LNCS, vol. 6659, pp. 53–113. Springer, Heidelberg (2011)

    Chapter  Google Scholar 

  6. Haddad, S., Monmege, B.: Reachability in MDPs: Refining convergence of value iteration. Technical Report LSV-14-07, LSV, ENS Cachan (2014), http://www.lsv.ens-cachan.fr/Publis/RAPPORTS_LSV/PDF/rr-lsv-2014-07.pdf

  7. Katoen, J.-P., Zapreev, I.S.: Safe on-the-fly steady-state detection for time-bounded reachability. In: QEST 2006, pp. 301–310 (2006)

    Google Scholar 

  8. Kattenbelt, M., Kwiatkowska, M.Z., Norman, G., Parker, D.: A game-based abstraction-refinement framework for Markov decision processes. Formal Methods in System Design 36(3), 246–280 (2010)

    Article  MATH  Google Scholar 

  9. Kwiatkowska, M., Norman, G., Parker, D.: PRISM 4.0: Verification of probabilistic real-time systems. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS, vol. 6806, pp. 585–591. Springer, Heidelberg (2011)

    Chapter  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Haddad, S., Monmege, B. (2014). Reachability in MDPs: Refining Convergence of Value Iteration. In: Ouaknine, J., Potapov, I., Worrell, J. (eds) Reachability Problems. RP 2014. Lecture Notes in Computer Science, vol 8762. Springer, Cham. https://doi.org/10.1007/978-3-319-11439-2_10

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-11439-2_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-11438-5

  • Online ISBN: 978-3-319-11439-2

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics