We study a Markov decision process under the criterion of maximizing the probability that the total discounted reward exceeds a target level. We focus on the dynamic programming equations of the model, establish various properties of the optimal return operator, and, for the infinite planning-horizon model, characterize the optimal value function as the maximal fixed point of this operator. Several turnpike results relating the finite- and infinite-horizon models are also given.
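The criterion above can be illustrated by successive approximations on a target-augmented value function: v(s, t) is the maximal probability that the total discounted reward, starting from state s, exceeds the residual target t. A minimal numerical sketch follows; the toy transition matrix `P`, reward table `r`, discount `beta`, target grid, and the nearest-gridpoint lookup are all illustrative assumptions, not the paper's exact construction. Starting the iteration from v0 ≡ 1 mirrors the characterization of the optimal value function as a maximal fixed point.

```python
import numpy as np

# Toy MDP (invented data): P[a, s, s'] transition probabilities,
# r[a, s] immediate rewards, beta the discount factor.
P = np.array([[[0.8, 0.2], [0.3, 0.7]],
              [[0.5, 0.5], [0.9, 0.1]]])
r = np.array([[1.0, 0.0],
              [0.5, 2.0]])
beta = 0.8

# Discretized grid of target levels, 0 up to the maximal discounted reward.
targets = np.linspace(0.0, r.max() / (1 - beta), 401)

def T(v):
    """One application of the (discretized) optimal return operator.

    v has shape (n_states, n_targets); v[s, k] approximates the maximal
    probability that total discounted reward >= targets[k] from state s.
    """
    out = np.full_like(v, -np.inf)
    for a in range(P.shape[0]):
        # Residual target after earning r[a, s] now and discounting by beta.
        shifted = (targets[None, :] - r[a][:, None]) / beta
        for s in range(P.shape[1]):
            q = np.ones(len(targets))          # residual <= 0: met w.p. 1
            pos = shifted[s] > 0
            # Crude nearest-gridpoint lookup; residuals beyond the grid
            # are clamped to the last target (a small tail approximation).
            idx = np.searchsorted(targets, shifted[s][pos])
            idx = idx.clip(0, len(targets) - 1)
            q[pos] = P[a, s] @ v[:, idx]
            out[s] = np.maximum(out[s], q)     # maximize over actions
    return out

# Successive approximations from v0 = 1: the iterates decrease
# monotonically toward the maximal fixed point of T.
v = np.ones((2, len(targets)))
for _ in range(200):
    v = T(v)
```

The iterates remain probabilities, are nonincreasing in the target level, and equal 1 at target 0 (the rewards here are nonnegative), which gives a quick sanity check on the fixed-point computation.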
Communicated by M. Pachter
Bouakiz, M., Kebir, Y. Target-level criterion in Markov decision processes. J Optim Theory Appl 86, 1–15 (1995). https://doi.org/10.1007/BF02193458
- Markov decision processes
- target-level criterion
- fixed points
- dynamic programming
- successive approximations