Abstract
This paper gives an overview of linear programming methods for solving standard and nonstandard Markovian control problems. Standard problems are those with the usual criteria, such as expected total (discounted) rewards and average expected rewards; we also discuss a particular class of stochastic games. Nonstandard problems involve additional considerations such as side constraints, multiple criteria, or mean-variance tradeoffs. In a companion paper, efficient linear programming algorithms are discussed for some applications.
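As a minimal illustration of the kind of method surveyed, the classical linear-programming formulation of a discounted Markov decision problem (going back to Manne and d'Epenoux) minimizes the sum of the state values subject to one inequality per state-action pair. The sketch below, which is not taken from the paper, solves a small hypothetical two-state, two-action MDP with SciPy; the rewards, transition probabilities, and discount factor are illustrative assumptions.

```python
# Sketch: LP formulation of a discounted MDP (assumed toy data, not from the paper).
import numpy as np
from scipy.optimize import linprog

beta = 0.9  # discount factor (assumption)
# rewards r[s, a] and transitions P[s, a, s'] for a hypothetical MDP
r = np.array([[1.0, 2.0],
              [0.0, 3.0]])
P = np.array([[[0.8, 0.2], [0.3, 0.7]],
              [[0.5, 0.5], [0.1, 0.9]]])

n_states, n_actions = r.shape
# LP: minimize sum_s v_s  subject to  v_s >= r(s,a) + beta * sum_{s'} P(s'|s,a) v_{s'}
# rewritten for linprog as  (beta * P(.|s,a) - e_s) @ v <= -r(s,a)
A_ub, b_ub = [], []
for s in range(n_states):
    for a in range(n_actions):
        row = beta * P[s, a].copy()
        row[s] -= 1.0
        A_ub.append(row)
        b_ub.append(-r[s, a])

res = linprog(c=np.ones(n_states), A_ub=np.array(A_ub), b_ub=np.array(b_ub),
              bounds=[(None, None)] * n_states)
v = res.x                                      # optimal value vector
policy = np.argmax(r + beta * (P @ v), axis=1)  # greedy (optimal) stationary policy
```

The LP optimum coincides with the optimal value function of the MDP, and a maximizing action in each Bellman inequality yields an optimal stationary deterministic policy.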
Kallenberg, L.C.M. Survey of linear programming for standard and nonstandard Markovian control problems. Part I: Theory. ZOR - Methods and Models of Operations Research 40, 1–42 (1994). https://doi.org/10.1007/BF01414028