Abstract
We consider the subclass of linear programs that formulate Markov Decision Processes (MDPs). We show that the Simplex algorithm with the Gass-Saaty shadow-vertex pivoting rule is strongly polynomial for a subclass of MDPs called controlled random walks (CRWs); the running time is O(|S|^3 ⋅ |U|^2), where |S| denotes the number of states and |U| denotes the number of actions per state. This result improves the running time of the algorithm of Zadorojniy et al. (Mathematics of Operations Research 34(4):992–1007, 2009) by a factor of |S|. In particular, the number of iterations needed by the Simplex algorithm for CRWs is linear in the number of states and does not depend on the discount factor.
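For orientation, a discounted MDP is commonly written as a linear program over occupation measures. The sketch below is that textbook form, not necessarily the exact formulation used in the article. All symbols are assumptions introduced here for illustration: x(s,u) is the discounted occupation measure of the state-action pair (s,u), c(s,u) a per-step cost, P(s' | s,u) the transition probability, μ the initial state distribution, and λ ∈ (0,1) the discount factor.

```latex
\begin{align*}
\text{minimize}\quad   & \sum_{s \in S} \sum_{u \in U} c(s,u)\, x(s,u) \\
\text{subject to}\quad & \sum_{u \in U} x(s',u)
      \;-\; \lambda \sum_{s \in S} \sum_{u \in U} P(s' \mid s,u)\, x(s,u)
      \;=\; \mu(s') \qquad \text{for all } s' \in S, \\
                       & x(s,u) \ge 0 \qquad \text{for all } s \in S,\ u \in U.
\end{align*}
```

In this form the program has |S| equality constraints and |S|⋅|U| variables. When μ is strictly positive, each basic feasible solution corresponds to a deterministic stationary policy, so a Simplex pivot amounts to changing the action chosen in one state.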
References
Amenta, N., & Ziegler, G. M. (1996). Deformed products and maximal shadows of polytopes. In Advances in discrete and computational geometry. Contemporary mathematics: Vol. 223. Providence: Am. Math. Soc.
Barasz, M., & Vempala, S. (2010). A new approach to strongly polynomial linear programming. In Innovations in computer science (pp. 42–48).
Borgwardt, K. H. (1982a). The average number of pivot steps required by the simplex-method is polynomial. Mathematical Methods of Operations Research, 26(1), 157–177.
Borgwardt, K. H. (1982b). Some distribution-independent results about the asymptotic order of the average number of pivot steps of the simplex method. Mathematics of Operations Research, 7(3), 441–462.
de Ghellinck, G. (1960). Les problemes de decisions sequentielles. Cahiers Du Centre D’études de Recherche Opérationnelle, 2, 161–179.
d’Epenoux, F. (1963). A probabilistic production and inventory problem. Management Science, 10(1), 98–108.
Derman, C. (1962). On sequential decisions and Markov chains. Management Science, 9(1), 16–24.
Gass, S., & Saaty, T. (1955). The computational algorithm for the parametric objective function. Naval Research Logistics Quarterly, 2(1–2), 39–45.
Kitaev, M. Y., & Rykov, V. V. (1995). Controlled queueing systems. Boca Raton: CRC Press.
Kleinrock, L. (1975). Queueing systems, Vol. I: Theory. New York: Wiley.
Manne, A. S. (1960). Linear programming and sequential decisions. Management Science, 6(3), 259–267.
Matoušek, J., & Gärtner, B. (2007). Understanding and using linear programming. Berlin: Springer.
Megiddo, N. (1984). Linear programming in linear time when the dimension is fixed. Journal of the ACM, 31(1), 114–127.
Megiddo, N. (1987). On the complexity of linear programming. In T. Bewley (Ed.), Advances in economic theory: fifth world congress (pp. 225–268). Cambridge: Cambridge University Press.
Megiddo, N., & Chandrasekaran, R. (1989). On the ε-perturbation method for avoiding degeneracy. Operations Research Letters, 8(6), 305–308.
Meyn, S. P. (2008). Control techniques for complex networks. Cambridge: Cambridge University Press.
Puterman, M. L. (1994). Markov decision processes: discrete stochastic dynamic programming. New York: Wiley.
Schrijver, A. (1998). Theory of linear and integer programming. New York: Wiley.
Serfozo, R. F. (1979). An equivalence between continuous and discrete time Markov decision processes. Operations Research, 27(3), 616–620.
Spielman, D. A., & Teng, S. H. (2004). Smoothed analysis of algorithms: why the simplex algorithm usually takes polynomial time. Journal of the ACM, 51(3), 385–463.
Tardos, E. (1985). A strongly polynomial minimum cost circulation algorithm. Combinatorica, 5(3), 247–255.
Tardos, E. (1986). A strongly polynomial algorithm to solve combinatorial linear programs. Operations Research, 34(2), 250–256.
Terlaky, T., & Zhang, S. (1993). Pivot rules for linear programming: a survey on recent theoretical developments. Annals of Operations Research, 46(1), 203–233.
Vershynin, R. (2006). Beyond Hirsch conjecture: walks on random polytopes and smoothed complexity of the simplex method. In FOCS’06. 47th annual IEEE symposium on foundations of computer science, 2006 (pp. 133–142). New York: IEEE Press.
Yadin, M., & Naor, P. (1967). On queueing systems with variable service capacities. Naval Research Logistics Quarterly, 14, 43–53.
Ye, Y. (2005). A new complexity result on solving the Markov decision problem. Mathematics of Operations Research, 30(3), 733–749.
Ye, Y. (2010). The simplex and policy-iteration methods are strongly polynomial for the Markov decision problem with a fixed discount rate. Seminar, talk.
Ye, Y. (2011). The simplex and policy-iteration methods are strongly polynomial for the Markov decision problem with a fixed discount rate. Mathematics of Operations Research, 36(4), 593–603.
Zadorojniy, A., & Even, G. Hyperbolic behavior of occupation measures between neighboring policies in CMDPs. http://www.eng.tau.ac.il/~sasha/.
Zadorojniy, A., Even, G., & Shwartz, A. (2009). A strongly polynomial algorithm for controlled queues. Mathematics of Operations Research, 34(4), 992–1007.
Cite this article
Even, G., Zadorojniy, A. Strong polynomiality of the Gass-Saaty shadow-vertex pivoting rule for controlled random walks. Ann Oper Res 201, 159–167 (2012). https://doi.org/10.1007/s10479-012-1199-x
DOI: https://doi.org/10.1007/s10479-012-1199-x