Mean, variance, and probabilistic criteria in finite Markov decision processes: A review
 D. J. White
Abstract
This paper is a survey of papers which make use of nonstandard Markov decision process criteria (i.e., those which do not seek simply to optimize expected returns per unit time or expected discounted return). It covers infinite-horizon nondiscounted formulations, infinite-horizon discounted formulations, and finite-horizon formulations. For problem formulations in terms solely of the probabilities of being in each state and taking each action, policy equivalence results are given which allow policies to be restricted to the class of Markov policies or to the randomizations of deterministic Markov policies. For problems which cannot be stated in such terms, in terms of the primitive state set I, formulations involving a redefinition of the states are examined.
 Journal
 Journal of Optimization Theory and Applications
 Volume 56, Issue 1, pp. 1–29
 Cover Date
 1988-01-01
 DOI
 10.1007/BF00938524
 Print ISSN
 0022-3239
 Online ISSN
 1573-2878
 Publisher
 Kluwer Academic Publishers/Plenum Publishers
 Keywords

 Markov decision processes
 infinite horizon
 finite horizon
 mean
 variance
 probabilistic criteria
 Authors

 D. J. White (1)
 Author Affiliations

 1. University of Manchester, Manchester, England