Strategy Complexity of Finite-Horizon Markov Decision Processes and Simple Stochastic Games

  • Krishnendu Chatterjee
  • Rasmus Ibsen-Jensen
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7721)

Abstract

Markov decision processes (MDPs) and simple stochastic games (SSGs) provide a rich mathematical framework for studying many important problems related to probabilistic systems. MDPs and SSGs with finite-horizon objectives, where the goal is to maximize the probability of reaching a target state within a given finite time, form a classical and well-studied problem. In this work we consider the strategy complexity of finite-horizon MDPs and SSGs. We show that, for all ε > 0, the natural class of counter-based strategies requires at most \(\log \log (\frac{1}{\epsilon}) + n+1\) memory states for ε-optimality, and that memory of size \(\Omega(\log \log (\frac{1}{\epsilon}) + n)\) is required, where n is the number of states of the MDP (resp. SSG). Thus our bounds are asymptotically optimal. We then study the periodic property of optimal strategies, and show a sub-exponential lower bound on the period for optimal strategies.
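To make the finite-horizon objective concrete, the following is a minimal sketch of standard backward induction for maximizing the probability of reaching a target within a given number of steps. It is not taken from the paper: the dictionary encoding of the MDP, the function name `finite_horizon_reach`, and the three-state example are all assumptions chosen for illustration. The example shows why a counter over the remaining steps can matter: the best action in state `s` changes with the number of steps left.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding (an assumption for this sketch):
# state -> action -> list of (probability, successor state).
MDP = Dict[str, Dict[str, List[Tuple[float, str]]]]


def finite_horizon_reach(mdp: MDP, target: str, horizon: int):
    """Backward induction for finite-horizon reachability.

    values[t][s] is the maximal probability of reaching `target` from s
    within t steps; strategy[t][s] is an action attaining it when t steps
    remain. The strategy is indexed by the remaining-step counter.
    """
    values = [{s: 1.0 if s == target else 0.0 for s in mdp}]
    strategy: List[Dict[str, str]] = [{}]
    for _ in range(horizon):
        prev = values[-1]
        cur: Dict[str, float] = {}
        choice: Dict[str, str] = {}
        for s, actions in mdp.items():
            if s == target:
                cur[s] = 1.0  # already at the target
                continue
            best_action, best_value = None, 0.0
            for a, dist in actions.items():
                v = sum(p * prev[nxt] for p, nxt in dist)
                if best_action is None or v > best_value:
                    best_action, best_value = a, v
            cur[s] = best_value
            if best_action is not None:
                choice[s] = best_action
        values.append(cur)
        strategy.append(choice)
    return values, strategy


# Illustrative three-state MDP (made up for this sketch).
example: MDP = {
    "s": {
        "safe": [(0.5, "goal"), (0.5, "s")],      # may retry later
        "risky": [(0.6, "goal"), (0.4, "sink")],  # better odds, no retry
    },
    "goal": {},
    "sink": {"stay": [(1.0, "sink")]},
}

values, strategy = finite_horizon_reach(example, "goal", 3)
for t in range(1, 4):
    print(t, strategy[t]["s"], round(values[t]["s"], 3))
```

In this example the action chosen in `s` is `risky` when one step remains and `safe` when more steps remain, so a strategy that ignores the step counter cannot be optimal for every horizon.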

Keywords

Optimal Strategy; Markov Decision Process; Terminal State; Stochastic Game; Strategy Complexity
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Krishnendu Chatterjee (1)
  • Rasmus Ibsen-Jensen (2)
  1. IST Austria
  2. Department of Computer Science, Aarhus University, Denmark