Exponential Lower Bounds for Policy Iteration

  • John Fearnley
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6199)

Abstract

We study policy iteration for infinite-horizon Markov decision processes. It has recently been shown that policy-iteration-style algorithms have exponential lower bounds in a two-player game setting. We extend these lower bounds to Markov decision processes with the total-reward and average-reward optimality criteria.
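
For readers unfamiliar with the algorithm the abstract refers to, the following is a minimal sketch of policy iteration under the total-reward criterion. It is an illustration only, not the construction used in the paper. It assumes that every policy reaches a zero-reward sink with probability 1, so that the evaluation equations have a unique solution, and it uses the greedy rule that switches every improvable state in each round. The data layout and the names `evaluate`, `action_value`, `policy_iteration`, and `EXAMPLE` are hypothetical.

```python
from fractions import Fraction as F


def evaluate(mdp, policy):
    """Total-reward values of a fixed policy: solve (I - P) v = r exactly.

    Assumes every policy reaches the sink with probability 1, so the
    system is non-singular. Successors not listed in `mdp` are treated
    as the zero-reward sink.
    """
    states = sorted(mdp)
    idx = {s: i for i, s in enumerate(states)}
    n = len(states)
    # Augmented matrix [I - P | r].
    A = [[F(0)] * (n + 1) for _ in range(n)]
    for s in states:
        reward, trans = mdp[s][policy[s]]
        i = idx[s]
        A[i][i] += 1
        A[i][n] = F(reward)
        for t, p in trans.items():
            if t in idx:
                A[i][idx[t]] -= F(p)
    # Gauss-Jordan elimination over exact rationals.
    for c in range(n):
        pivot = next(r for r in range(c, n) if A[r][c] != 0)
        A[c], A[pivot] = A[pivot], A[c]
        lead = A[c][c]
        A[c] = [x / lead for x in A[c]]
        for r in range(n):
            if r != c and A[r][c] != 0:
                factor = A[r][c]
                A[r] = [x - factor * y for x, y in zip(A[r], A[c])]
    return {s: A[idx[s]][n] for s in states}


def action_value(action, v):
    """One-step lookahead: reward plus expected value of the successor."""
    reward, trans = action
    return reward + sum(p * v.get(t, 0) for t, p in trans.items())


def policy_iteration(mdp, policy):
    """Greedy (all-switches) policy iteration; returns policy, values, rounds."""
    policy = dict(policy)
    rounds = 0
    while True:
        v = evaluate(mdp, policy)
        improved = False
        for s, actions in mdp.items():
            best = max(range(len(actions)),
                       key=lambda a: action_value(actions[a], v))
            # Switch only on strict improvement over the current value.
            if action_value(actions[best], v) > v[s]:
                policy[s] = best
                improved = True
        if not improved:
            return policy, v, rounds
        rounds += 1


# Hypothetical two-state example; "end" is the sink.
# Each action is (reward, {successor: probability}).
EXAMPLE = {
    "a": [(1, {"end": F(1)}), (0, {"b": F(1)})],
    "b": [(2, {"end": F(1)}), (5, {"end": F(1, 2), "a": F(1, 2)})],
}

if __name__ == "__main__":
    print(policy_iteration(EXAMPLE, {"a": 0, "b": 0}))
```

This toy instance terminates after a single improvement round. The lower bounds discussed in the abstract concern carefully constructed families of instances on which the number of rounds grows exponentially.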

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • John Fearnley (1)
  1. Department of Computer Science, University of Warwick, UK