Abstract
We study policy iteration for infinite-horizon Markov decision processes. It has recently been shown policy iteration style algorithms have exponential lower bounds in a two player game setting. We extend these lower bounds to Markov decision processes with the total reward and average-reward optimality criteria.
This is a preview of subscription content, log in via an institution.
Buying options
Tax calculation will be finalised at checkout
Purchases are for personal use only
Learn about institutional subscriptionsPreview
Unable to display preview. Download preview PDF.
References
Andersson, D.: Extending Friedmann’s lower bound to the Hoffman-Karp algorithm. Preprint (June 2009)
Andersson, D., Hansen, T.D., Miltersen, P.B.: Toward better bounds on policy iteration. Preprint (June 2009)
Ash, R.B., Doléans-Dade, C.A.: Probability and Measure Theory. Academic Press, London (2000)
Friedmann, O.: A super-polynomial lower bound for the parity game strategy improvement algorithm as we know it. In: Logic in Computer Science (LICS). IEEE, Los Alamitos (2009)
Howard, R.: Dynamic Programming and Markov Processes. Technology Press and Wiley (1960)
Mansour, Y., Singh, S.P.: On the complexity of policy iteration. In: Laskey, K.B., Prade, H. (eds.) UAI 1999: Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence, pp. 401–408. Morgan Kaufmann, San Francisco (1999)
Melekopoglou, M., Condon, A.: On the complexity of the policy improvement algorithm for Markov decision processes. ORSA Journal on Computing 6, 188–192 (1994)
Puterman, M.L.: Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley & Sons, Inc., New York (1994)
Vöge, J., Jurdziński, M.: A discrete strategy improvement algorithm for solving parity games (Extended abstract). In: Emerson, E.A., Sistla, A.P. (eds.) CAV 2000. LNCS, vol. 1855, pp. 202–215. Springer, Heidelberg (2000)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fearnley, J. (2010). Exponential Lower Bounds for Policy Iteration. In: Abramsky, S., Gavoille, C., Kirchner, C., Meyer auf der Heide, F., Spirakis, P.G. (eds) Automata, Languages and Programming. ICALP 2010. Lecture Notes in Computer Science, vol 6199. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14162-1_46
Download citation
DOI: https://doi.org/10.1007/978-3-642-14162-1_46
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14161-4
Online ISBN: 978-3-642-14162-1
eBook Packages: Computer ScienceComputer Science (R0)