Handbook of Markov Decision Processes, pp. 209–229

# Mixed Criteria


## Abstract

Mixed criteria are linear combinations of standard criteria that cannot themselves be represented as a standard criterion. Examples include linear combinations of total discounted and average rewards, and linear combinations of total discounted rewards with different discount factors. We discuss the structure of optimal policies and algorithms for computing them, for problems with and without constraints.
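For a fixed stationary policy in a finite MDP, a weighted discounted criterion of the kind the abstract mentions can be evaluated by solving one linear system per discount factor and combining the results. The sketch below is purely illustrative and not from the chapter; the transition matrix, rewards, weights, and function names are all our own assumptions.

```python
import numpy as np

def discounted_value(P, r, beta):
    """Expected total beta-discounted reward of a fixed stationary policy.

    Under a stationary policy the MDP reduces to a Markov chain with
    transition matrix P and reward vector r; the value vector solves
    v = r + beta * P @ v, i.e. v = (I - beta * P)^{-1} r.
    """
    n = len(r)
    return np.linalg.solve(np.eye(n) - beta * P, r)

def weighted_discounted_value(P, r, weights, betas):
    """Mixed criterion: a linear combination of discounted values
    with several discount factors (hypothetical helper)."""
    return sum(w * discounted_value(P, r, b) for w, b in zip(weights, betas))

# Two-state chain induced by some stationary policy (made-up numbers).
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])
r = np.array([1.0, 0.0])

v = weighted_discounted_value(P, r, weights=[0.5, 0.5], betas=[0.3, 0.9])
```

Note that while each single-discount value is computed by one stationary policy evaluation, the chapter's point is that optimizing such a mixed criterion generally cannot be reduced to a single standard discounted problem.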

## Keywords

Stationary policy, optimal policy, discount factor, Markov decision process, stochastic game



## Copyright information

© Springer Science+Business Media New York 2002