Target-level criterion in Markov decision processes

Markov decision processes are studied under the criterion of maximizing the probability that the total discounted reward exceeds a target level. We focus on the dynamic programming equations of the model. We give several properties of the optimal return operator and, for the infinite-horizon model, we characterize the optimal value function as a maximal fixed point of this operator. Several turnpike results relating the finite-horizon and infinite-horizon models are also given.
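
The abstract does not reproduce the dynamic programming equations. As a rough illustration only, the sketch below uses one common way of writing a finite-horizon target-level recursion: the state is augmented with the remaining target x, and after a reward r is collected the target for the remaining stages becomes (x - r) / beta. The recursion, the strict-inequality convention for "exceed", the names `target_value`, `P`, `R`, and `BETA`, and all numbers are assumptions made for this example, not taken from the paper.

```python
# Hypothetical two-state, two-action MDP (illustrative numbers, not from the paper).
# P[s][a] lists (next_state, probability) pairs; R[s][a] is the immediate reward.
P = {
    0: {0: [(0, 1.0)], 1: [(0, 0.5), (1, 0.5)]},
    1: {0: [(1, 1.0)], 1: [(0, 1.0)]},
}
R = {
    0: {0: 1.0, 1: 0.0},
    1: {0: 4.0, 1: 0.0},
}
BETA = 0.5  # discount factor (assumed)


def target_value(n, s, x):
    """Maximal probability that the n-stage discounted reward from s strictly exceeds x."""
    if n == 0:
        # With no stages left the accumulated reward is 0, which exceeds x only if x < 0.
        return 1.0 if x < 0 else 0.0
    # Collecting reward r now leaves the rescaled target (x - r) / BETA for later stages.
    return max(
        sum(p * target_value(n - 1, s2, (x - R[s][a]) / BETA) for s2, p in P[s][a])
        for a in P[s]
    )


# With two stages from state 0 and target 1.75, only the risky action 1 can succeed.
print(target_value(2, 0, 1.75))
```

In this toy instance the safe action in state 0 can never earn more than 1.5 over two stages, so the maximizing policy takes the risky action, which is the kind of target-dependent behavior the criterion is meant to capture.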

Communicated by M. Pachter

Bouakiz, M., Kebir, Y. Target-level criterion in Markov decision processes. J Optim Theory Appl 86, 1–15 (1995). https://doi.org/10.1007/BF02193458

Key Words

  • Markov decision processes
  • target-level criterion
  • fixed points
  • dynamic programming
  • successive approximations