Approximating the Termination Value of One-Counter MDPs and Stochastic Games

Brázdil, Tomáš; Brožek, Václav; Etessami, Kousha; Kučera, Antonín

doi:10.1007/978-3-642-22012-8_26

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6756)

Included in the following conference series:

International Colloquium on Automata, Languages, and Programming

Abstract

One-counter MDPs (OC-MDPs) and one-counter simple stochastic games (OC-SSGs) are, respectively, 1-player and 2-player turn-based zero-sum stochastic games played on the transition graph of classic one-counter automata (equivalently, pushdown automata with a 1-letter stack alphabet). A key objective for the analysis and verification of these games is the termination objective, where one player aims to maximize, and the other to minimize, the probability of hitting counter value 0, starting at a given control state and a given counter value.
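
To make the termination objective concrete, here is a minimal Python sketch for the 1-player (maximizing) case. It is not the algorithm of the paper: it computes the best probability of hitting counter value 0 within a fixed number of steps, which is only a lower bound on the optimal termination value. The encoding (`OCMDP`, `Action`, counter changes restricted to -1, 0, +1) and the function name `termination_lower_bound` are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding (an assumption, not taken from the paper):
# each control state maps to a list of actions; each action is a list of
# (probability, next_state, counter_change) with counter_change in {-1, 0, +1}.
Action = List[Tuple[float, str, int]]
OCMDP = Dict[str, List[Action]]


def termination_lower_bound(mdp: OCMDP, state: str, counter: int, steps: int) -> float:
    """Best probability of reaching counter value 0 within `steps` steps.

    This bounded-horizon value is a lower bound on the optimal
    termination value and is non-decreasing in `steps`.
    """
    # Within `steps` steps the counter can grow to at most counter + steps.
    cap = counter + steps
    # value[s][c]: best probability of termination from (s, c) within k steps;
    # for k = 0 only configurations with counter 0 have terminated.
    value = {s: [1.0 if c == 0 else 0.0 for c in range(cap + 2)] for s in mdp}
    for _ in range(steps):
        new = {s: [0.0] * (cap + 2) for s in mdp}
        for s, actions in mdp.items():
            new[s][0] = 1.0  # counter 0 means the run has already terminated
            for c in range(1, cap + 1):
                # The maximizer picks the action with the best expected value.
                new[s][c] = max(
                    sum(p * value[t][c + d] for p, t, d in action)
                    for action in actions
                )
        value = new
    return value[state][counter]


# Toy example: one control state with a "gamble" action (counter goes down
# or up with probability 1/2 each) and a "wait" action (counter unchanged).
toy: OCMDP = {
    "s": [
        [(0.5, "s", -1), (0.5, "s", +1)],  # gamble
        [(1.0, "s", 0)],                   # wait
    ]
}
print(termination_lower_bound(toy, "s", 1, 3))
```

Increasing `steps` tightens the bound from below, but gives no indication of how far it still is from the true value; obtaining a computable error guarantee is exactly the difficulty the abstract goes on to describe.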

Recently [4,2], we studied qualitative decision problems (“is the optimal termination value = 1?”) for OC-MDPs (and OC-SSGs) and showed them to be decidable in P-time (in NP∩coNP, respectively). However, quantitative decision and approximation problems (“is the optimal termination value ≥ p?”, or “approximate the termination value within ε”) are far more challenging. This is so in part because optimal strategies may not exist, and because even when they do exist they can have a highly non-trivial structure. It thus remained open whether any of these quantitative termination problems is even computable.

In this paper we show that all quantitative approximation problems for the termination value for OC-MDPs and OC-SSGs are computable. Specifically, given an OC-SSG, and given ε > 0, we can compute a value v that approximates the value of the OC-SSG termination game within additive error ε, and furthermore we can compute ε-optimal strategies for both players in the game.

A key ingredient in our proofs is a subtle martingale, derived from solving certain LPs that we can associate with a maximizing OC-MDP. An application of Azuma’s inequality on these martingales yields a computable bound for the “wealth” at which a “rich person’s strategy” becomes ε-optimal for OC-MDPs.
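
For reference, the standard form of Azuma's inequality (also called the Azuma-Hoeffding inequality) is recalled below. The abstract does not spell out the martingale it is applied to, so the symbols $X_k$, $c_k$, $n$ and $t$ are generic placeholders rather than the paper's notation.

```latex
% Azuma's inequality, standard form.
% Let (X_k)_{k \ge 0} be a martingale with bounded differences:
%   |X_k - X_{k-1}| \le c_k  almost surely, for all k \ge 1.
% Then for every n \ge 1 and every t > 0:
\[
  \Pr\bigl[\, X_n - X_0 \ge t \,\bigr]
  \;\le\;
  \exp\!\left( \frac{-\,t^{2}}{2 \sum_{k=1}^{n} c_k^{2}} \right),
\]
% and the same bound holds for \Pr[ X_n - X_0 \le -t ].
```

The bound decays exponentially in $t^2$, which is the kind of tail estimate that can turn a bounded-difference martingale into an explicit, computable threshold.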

References

[1] Berger, N., Kapur, N., Schulman, L.J., Vazirani, V.: Solvency Games. In: Proc. of FSTTCS 2008 (2008)
[2] Brázdil, T., Brožek, V., Etessami, K.: One-Counter Simple Stochastic Games. In: Proc. of FSTTCS 2010, pp. 108–119 (2010)
[3] Brázdil, T., Brožek, V., Etessami, K., Kučera, A.: Approximating the Termination Value of One-Counter MDPs and Stochastic Games. Tech. Rep. abs/1104.4978, CoRR (2011), http://arxiv.org/abs/1104.4978
[4] Brázdil, T., Brožek, V., Etessami, K., Kučera, A., Wojtczak, D.: One-Counter Markov Decision Processes. In: ACM-SIAM SODA, pp. 863–874 (2010), full tech. report: CoRR, abs/0904.2511 (2009), http://arxiv.org/abs/0904.2511
[5] Brázdil, T., Brožek, V., Forejt, V., Kučera, A.: Reachability in recursive Markov decision processes. In: Baier, C., Hermanns, H. (eds.) CONCUR 2006. LNCS, vol. 4137, pp. 358–374. Springer, Heidelberg (2006)
[6] Brázdil, T., Brožek, V., Kučera, A., Obdržálek, J.: Qualitative Reachability in stochastic BPA games. In: Proc. 26th STACS, pp. 207–218 (2009)
[7] Brázdil, T., Kiefer, S., Kučera, A.: Efficient analysis of probabilistic programs with an unbounded counter. CoRR abs/1102.2529 (2011)
[8] Etessami, K., Wojtczak, D., Yannakakis, M.: Quasi-birth-death processes, tree-like QBDs, probabilistic 1-counter automata, and pushdown systems. In: Proc. 5th Int. Symp. on Quantitative Evaluation of Systems (QEST), pp. 243–253 (2008)
[9] Etessami, K., Yannakakis, M.: Recursive Markov decision processes and recursive stochastic games. In: Caires, L., Italiano, G.F., Monteiro, L., Palamidessi, C., Yung, M. (eds.) ICALP 2005. LNCS, vol. 3580, pp. 891–903. Springer, Heidelberg (2005)
[10] Etessami, K., Yannakakis, M.: Efficient qualitative analysis of classes of recursive Markov decision processes and simple stochastic games. In: Durand, B., Thomas, W. (eds.) STACS 2006. LNCS, vol. 3884. Springer, Heidelberg (2006)
[11] Grimmett, G.R., Stirzaker, D.R.: Probability and Random Processes, 2nd edn. Oxford U. Press, Oxford (1992)
[12] Lambert, J., Van Houdt, B., Blondia, C.: A policy iteration algorithm for Markov decision processes skip-free in one direction. In: ValueTools. ICST, Brussels (2007)
[13] Puterman, M.L.: Markov Decision Processes. J. Wiley and Sons, Chichester (1994)
[14] White, L.B.: A new policy iteration algorithm for Markov decision processes with quasi birth-death structure. Stochastic Models 21, 785–797 (2005)

Author information

Authors and Affiliations

Faculty of Informatics, Masaryk University, Czech Republic
Tomáš Brázdil & Antonín Kučera
School of Informatics, University of Edinburgh, UK
Václav Brožek & Kousha Etessami

Editor information

Editors and Affiliations

ICE-TCS, School of Computer Science, Reykjavik University, Menntavegur 1, IS 101, Reykjavik, Iceland
Luca Aceto
Fakultät für Informatik, Universität Wien, Universitätsstraße10/9, 1090, Wien, Österreich, Austria
Monika Henzinger
Department of Applied Mathematics, Charles University, Malostranské nám. 25, 118 00, Praha 1, Czech Republic
Jiří Sgall

Cite this paper

Brázdil, T., Brožek, V., Etessami, K., Kučera, A. (2011). Approximating the Termination Value of One-Counter MDPs and Stochastic Games. In: Aceto, L., Henzinger, M., Sgall, J. (eds) Automata, Languages and Programming. ICALP 2011. Lecture Notes in Computer Science, vol 6756. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22012-8_26

DOI: https://doi.org/10.1007/978-3-642-22012-8_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-22011-1
Online ISBN: 978-3-642-22012-8
eBook Packages: Computer Science, Computer Science (R0)