Abstract
Markov decision processes (MDPs) and simple stochastic games (SSGs) provide a rich mathematical framework for studying many important problems related to probabilistic systems. MDPs and SSGs with finite-horizon objectives, where the goal is to maximize the probability of reaching a target state within a given finite time, constitute a classical and well-studied problem. In this work we consider the strategy complexity of finite-horizon MDPs and SSGs. We show that for all ε > 0, the natural class of counter-based strategies requires at most \(\log \log (\frac{1}{\epsilon}) + n+1\) memory states for ε-optimality, and that memory of size \(\Omega(\log \log (\frac{1}{\epsilon}) + n)\) is necessary, where n is the number of states of the MDP (resp. SSG). Thus our bounds are asymptotically optimal. We then study the periodicity of optimal strategies, and show a sub-exponential lower bound on the period of optimal strategies.
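The finite-horizon objective described in the abstract is solved by backward induction, and the resulting optimal strategy is counter-based: the chosen action may depend on the number of steps remaining. The following is a minimal Python sketch on a small hypothetical three-state MDP (an illustration, not an example from the paper):

```python
# Backward induction for finite-horizon reachability in an MDP.
# `transitions[s]` lists the actions available in state s; each action is a
# dict mapping successor states to probabilities. (Hypothetical example MDP.)

def finite_horizon_reach(transitions, target, horizon):
    """v[t][s]: max probability of reaching `target` from s within t steps.
    sigma[t-1][s]: optimal action in s with t steps remaining (counter-based)."""
    states = range(len(transitions))
    v = [[1.0 if s == target else 0.0 for s in states]]
    sigma = []
    for _ in range(horizon):
        prev, layer, choice = v[-1], [], []
        for s in states:
            if s == target:
                layer.append(1.0)
                choice.append(None)
                continue
            best, arg = 0.0, None
            for a, dist in enumerate(transitions[s]):
                val = sum(p * prev[s2] for s2, p in dist.items())
                if val > best:
                    best, arg = val, a
            layer.append(best)
            choice.append(arg)
        v.append(layer)
        sigma.append(choice)
    return v, sigma

# Hypothetical MDP: state 0 is the (absorbing) target, state 2 is a trap.
# State 1 offers a "safe" action (0.5 to target, 0.5 stay) and a
# "risky" action (0.9 to target, 0.1 to the trap).
example = [
    [{0: 1.0}],
    [{0: 0.5, 1: 0.5}, {0: 0.9, 2: 0.1}],
    [{2: 1.0}],
]
v, sigma = finite_horizon_reach(example, target=0, horizon=3)
# With one step remaining the risky action is optimal (0.9 > 0.5); with two
# or more steps remaining the safe action overtakes it (0.95 > 0.9) -- the
# optimal choice depends on the value of the counter.
```

This only illustrates why time-dependent (counter-based) strategies arise; the paper's contribution is bounding how coarse the counter can be made while staying ε-optimal.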
Work of the second author supported by the Sino-Danish Center for the Theory of Interactive Computation, funded by the Danish National Research Foundation and the National Science Foundation of China (under grant 61061130540). The second author acknowledges support from the Center for Research in the Foundations of Electronic Markets (CFEM), supported by the Danish Strategic Research Council. The first author was supported by FWF Grant No P 23499-N23, FWF NFN Grant No S11407-N23 (RiSE), ERC Start grant (279307: Graph Games), and Microsoft faculty fellows award.
© 2013 Springer-Verlag Berlin Heidelberg
Chatterjee, K., Ibsen-Jensen, R. (2013). Strategy Complexity of Finite-Horizon Markov Decision Processes and Simple Stochastic Games. In: Kučera, A., Henzinger, T.A., Nešetřil, J., Vojnar, T., Antoš, D. (eds) Mathematical and Engineering Methods in Computer Science. MEMICS 2012. Lecture Notes in Computer Science, vol 7721. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36046-6_11
Print ISBN: 978-3-642-36044-2
Online ISBN: 978-3-642-36046-6