, Volume 56, Issue 1, pp 1-29

Mean, variance, and probabilistic criteria in finite Markov decision processes: A review

Rent the article at a discount

Rent now

* Final gross prices may vary according to local VAT.

Get Access

Abstract

This paper is a survey of papers which make use of nonstandard Markov decision process criteria (i.e., those which do not seek simply to optimize expected returns per unit time or expected discounted return). It covers infinite-horizon nondiscounted formulations, infinite-horizon discounted formulations, and finite-horizon formulations. For problem formulations in terms solely of the probabilities of being in each state and taking each action, policy equivalence results are given which allow policies to be restricted to the class of Markov policies or to the randomizations of deterministic Markov policies. For problems which cannot be stated in such terms, in terms of the primitive state setI, formulations involving a redefinition of the states are examined.

Communicated by P. L. Yu
The author would like to thank two referees for a very thorough and helpful referceing of the original article and for the extra references (Refs. 47–52) now added to the original reference list.