Abstract

Howard’s algorithm is a fifty-year-old, generally applicable algorithm for sequential decision making in the face of uncertainty. It is routinely used in practice in numerous application areas that are so important that they usually go by their acronyms, e.g., OR, AI, and CAV. While Howard’s algorithm is generally recognized as fast in practice, its worst-case time complexity was poorly understood until recently. However, a surge of results since 2009 has led to a much more satisfactory understanding of the worst-case time complexity of the algorithm in the various settings in which it applies. In this talk, we shall survey these recent results and the open problems that remain.
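For readers unfamiliar with the algorithm, the following is a minimal sketch of Howard’s algorithm (policy iteration) in one of the settings it applies to, a finite Markov decision process with a fixed discount factor. It is an illustration under stated assumptions, not code from the talk: the names `solve_linear`, `action_value`, `policy_iteration`, `P`, `R`, and `gamma` are hypothetical, and the two-state example is made up. Each round evaluates the current policy exactly and then switches every state to a strictly better action; the number of such rounds is the quantity the worst-case bounds are about.

```python
from fractions import Fraction


def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination using exact Fractions."""
    n = len(b)
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Pick any row at or below the diagonal with a non-zero pivot.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        pv = m[col][col]
        m[col] = [v / pv for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                f = m[r][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[r][n] for r in range(n)]


def action_value(P, R, gamma, v, s, a):
    """One-step lookahead value of taking action a in state s."""
    return R[s][a] + gamma * sum(p * v[t] for t, p in enumerate(P[s][a]))


def policy_iteration(P, R, gamma):
    """Howard-style policy iteration for a discounted MDP (illustrative sketch).

    P[s][a] is a list of transition probabilities over states,
    R[s][a] is the expected immediate reward, 0 <= gamma < 1.
    Returns (policy, values, number_of_rounds).
    """
    n = len(P)
    policy = [0] * n  # assumption: start from the first action everywhere
    rounds = 0
    while True:
        rounds += 1
        # Policy evaluation: solve (I - gamma * P_pi) v = r_pi exactly.
        a_mat = [[(1 if s == t else 0) - gamma * P[s][policy[s]][t]
                  for t in range(n)] for s in range(n)]
        b_vec = [R[s][policy[s]] for s in range(n)]
        v = solve_linear(a_mat, b_vec)
        # Policy improvement: switch only on strict improvement.
        changed = False
        for s in range(n):
            best = max(range(len(P[s])),
                       key=lambda a: action_value(P, R, gamma, v, s, a))
            if (action_value(P, R, gamma, v, s, best)
                    > action_value(P, R, gamma, v, s, policy[s])):
                policy[s] = best
                changed = True
        if not changed:
            return policy, v, rounds


# Made-up example: state 0 can stay (reward 1) or move to state 1 (reward 0);
# state 1 has a single action that stays put with reward 3.
P = [[[1, 0], [0, 1]], [[0, 1]]]
R = [[1, 0], [3]]
policy, values, rounds = policy_iteration(P, R, Fraction(1, 2))
print(policy, values, rounds)
```

In this toy instance the initial policy stays in state 0, and a single improvement step switches it to moving to state 1. Exact rational arithmetic is used here only so that the strict-improvement test is not affected by rounding; a practical implementation would more likely use a numerical linear solver.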

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Peter Bro Miltersen (1)
  1. Department of Computer Science, Aarhus University, Denmark
