
Target-level criterion in Markov decision processes


Abstract

The Markov decision process is studied under the criterion of maximizing the probability that the total discounted reward exceeds a target level. We focus on the dynamic programming equations of the model. We give various properties of the optimal return operator and, for the infinite planning-horizon model, we characterize the optimal value function as a maximal fixed point of this operator. Various turnpike results relating the finite- and infinite-horizon models are also given.
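
As a rough illustration of the target-level criterion, the sketch below computes the finite-horizon value by backward recursion on a state augmented with the remaining target. It is not taken from the article: the two-state model, the names BETA, REWARD, TRANS and target_prob, and the convention that "exceed" means strictly greater are all assumptions made for this example, and the article's own formulation may differ. The recursion used is V_0(s, x) = 1 if x < 0 and 0 otherwise, and V_n(s, x) = max over actions a of the sum over next states s' of p(s' | s, a) V_{n-1}(s', (x - r(s, a)) / BETA).

```python
from functools import lru_cache

# Hypothetical two-state, two-action MDP (illustrative only, not from the article).
BETA = 0.5                          # discount factor
REWARD = [[1.0, 0.0], [4.0, 0.0]]   # REWARD[s][a]: one-stage reward
TRANS = [                           # TRANS[s][a][s2]: transition probabilities
    [[1.0, 0.0], [0.5, 0.5]],       # state 0: action 0 stays put, action 1 is a coin flip
    [[0.0, 1.0], [0.5, 0.5]],       # state 1: action 0 stays put, action 1 is a coin flip
]


@lru_cache(maxsize=None)
def target_prob(n, s, x):
    """Maximal probability that the n-stage discounted reward from state s
    strictly exceeds the target x (assumed convention)."""
    if n == 0:
        # No stages left: the accumulated reward is 0, which exceeds x only if x < 0.
        return 1.0 if x < 0 else 0.0
    best = 0.0
    for a in range(len(REWARD[s])):
        # After earning r now, the remaining rewards must exceed (x - r) / BETA.
        x_next = (x - REWARD[s][a]) / BETA
        value = sum(
            p * target_prob(n - 1, s2, x_next)
            for s2, p in enumerate(TRANS[s][a])
            if p > 0
        )
        best = max(best, value)
    return best


if __name__ == "__main__":
    for target in (1.0, 1.75, 2.0):
        print(target, target_prob(2, 0, target))
```

In this toy model the best action depends on the target: for a target of 1.0 the sure reward in state 0 suffices, while for a target of 1.75 only the coin-flip action has any chance of success. That dependence on the remaining target is why the recursion carries x alongside the state.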




Additional information

Communicated by M. Pachter


About this article

Cite this article

Bouakiz, M., Kebir, Y. Target-level criterion in Markov decision processes. J Optim Theory Appl 86, 1–15 (1995). https://doi.org/10.1007/BF02193458


Key Words

  • Markov decision processes
  • target-level criterion
  • fixed points
  • dynamic programming
  • successive approximations