Blackwell optimality in the class of stationary policies in Markov decision chains with a Borel state space and unbounded rewards

Mathematical Methods of Operations Research

Abstract.

This paper is the first part of a study of Blackwell optimal policies in Markov decision chains with a Borel state space and unbounded rewards. We prove the existence of deterministic stationary policies that are Blackwell optimal in the class of all stationary policies, which are in general randomized. We also establish a lexicographical policy improvement algorithm leading to Blackwell optimal policies, and the relation between such policies and the Blackwell optimality equation. Our technique combines the weighted norms approach developed in Dekker and Hordijk (1988) for countable models with unbounded rewards with the weak-strong topology approach used in Yushkevich (1997a) for Borel models with bounded rewards.
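
For orientation, the notion of Blackwell optimality referred to in the abstract can be sketched as follows. The notation is an assumption made here for illustration and is not taken from the paper: v_β(x, π) denotes the expected total β-discounted reward from state x under policy π, Π is a class of policies, and the h_n are the coefficients of the Laurent expansion known from finite models. The paper's Borel setting with unbounded rewards needs additional conditions that this sketch does not state.

```latex
% Assumed notation (not from the paper): v_\beta(x,\pi) is the expected
% total \beta-discounted reward from state x under policy \pi.
\[
\pi^* \text{ is Blackwell optimal in } \Pi
\iff
\forall x,\ \forall \pi \in \Pi\ \ \exists\, \beta_0(x,\pi) < 1:\quad
v_\beta(x,\pi^*) \ge v_\beta(x,\pi)
\quad \text{for all } \beta \in (\beta_0(x,\pi),\, 1).
\]
% Finite-model Laurent expansion, valid for small \rho = (1-\beta)/\beta > 0:
\[
v_\beta(\pi) = (1+\rho) \sum_{n=-1}^{\infty} \rho^{n}\, h_n(\pi).
\]
```

In the finite case, comparing two policies for β close to 1 therefore amounts to comparing the sequences (h_{-1}, h_0, h_1, ...) lexicographically, which is the idea a lexicographical policy improvement scheme exploits.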

Additional information

Manuscript received: March 1998 / Final version received: July 1998

About this article

Cite this article

Hordijk, A., Yushkevich, A. Blackwell optimality in the class of stationary policies in Markov decision chains with a Borel state space and unbounded rewards. Mathematical Methods of OR 49, 1–39 (1999). https://doi.org/10.1007/s001860050011

  • DOI: https://doi.org/10.1007/s001860050011
