Survey of linear programming for standard and nonstandard Markovian control problems. Part I: Theory

Published in Zeitschrift für Operations Research.

Abstract

This paper gives an overview of linear programming methods for solving standard and nonstandard Markovian control problems. Standard problems are problems with the usual criteria, such as expected total (discounted) rewards and average expected rewards; we also discuss a particular class of stochastic games. Nonstandard problems involve additional considerations such as side constraints, multiple criteria, or mean-variance tradeoffs. In a second companion paper, efficient linear programming algorithms are discussed for some applications.
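To illustrate the kind of method the survey covers, the classical primal LP formulation for the expected total discounted reward criterion (going back to Manne and d'Epenoux) can be sketched as follows. This is a minimal sketch, not the paper's own development: the two-state, two-action MDP data below are made up purely for illustration, and `scipy.optimize.linprog` stands in for any LP solver.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical two-state, two-action discounted MDP.
# P[a, i, j] = probability of moving to state j when action a is taken in state i.
# r[a, i]    = immediate reward for action a in state i.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.7, 0.3]]])
r = np.array([[1.0, 2.0],
              [0.5, 3.0]])
alpha = 0.9                      # discount factor
n_states, n_actions = 2, 2

# Primal LP: minimize sum_i v_i subject to
#   v_i >= r(i,a) + alpha * sum_j p(j|i,a) v_j   for all state-action pairs (i,a).
# In linprog's A_ub x <= b_ub form each constraint becomes
#   alpha * P[a,i,:] . v - v_i <= -r(i,a).
A_ub, b_ub = [], []
for a in range(n_actions):
    for i in range(n_states):
        A_ub.append(alpha * P[a, i] - np.eye(n_states)[i])
        b_ub.append(-r[a, i])

res = linprog(c=np.ones(n_states),
              A_ub=np.array(A_ub), b_ub=np.array(b_ub),
              bounds=[(None, None)] * n_states)
v = res.x                        # optimal discounted value vector
print(v)
```

The LP optimum is the value function of the MDP: it satisfies the Bellman optimality equation, and an optimal stationary policy can be read off from the constraints that are tight at the optimum (or from the dual variables, which form an occupation measure).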



Cite this article

Kallenberg, L.C.M. Survey of linear programming for standard and nonstandard Markovian control problems. Part I: Theory. ZOR - Methods and Models of Operations Research 40, 1–42 (1994). https://doi.org/10.1007/BF01414028
