Blackwell optimality in the class of stationary policies in Markov decision chains with a Borel state space and unbounded rewards

Hordijk, Arie; Yushkevich, Alexander A.

doi:10.1007/s001860050011

Published: March 1999

Mathematical Methods of Operations Research, Volume 49, pages 1–39 (1999)

Abstract.

This paper is the first part of a study of Blackwell optimal policies in Markov decision chains with a Borel state space and unbounded rewards. We prove here the existence of deterministic stationary policies which are Blackwell optimal in the class of all, in general randomized, stationary policies. We also establish a lexicographical policy improvement algorithm leading to Blackwell optimal policies, as well as the relation between such policies and the Blackwell optimality equation. Our technique combines the weighted norms approach developed in Dekker and Hordijk (1988) for countable models with unbounded rewards with the weak-strong topology approach used in Yushkevich (1997a) for Borel models with bounded rewards.
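As a rough illustration of the notion studied here, the sketch below uses a finite toy model, which is far simpler than the Borel setting with unbounded rewards treated in the paper. In finite models, a stationary policy is called Blackwell optimal when it is optimal for every discount factor sufficiently close to 1. The two-state model, the names `P`, `R`, `discounted_value` and `optimal_policies`, and the grid of discount factors are all assumptions made for this example; none of them comes from the paper, and the code does not implement the paper's lexicographical policy improvement algorithm.

```python
from itertools import product

# Hypothetical toy model (an assumption for illustration, not from the paper).
# State 1 is absorbing with zero reward; state 0 offers two actions.
# P[s][a] is the transition row from state s under action a.
# R[s][a] is the one-step reward in state s under action a.
P = {0: {0: [0.5, 0.5], 1: [0.0, 1.0]}, 1: {0: [0.0, 1.0]}}
R = {0: {0: 1.0, 1: 2.0}, 1: {0: 0.0}}

# A deterministic stationary policy is a tuple (action in state 0, action in state 1).
POLICIES = list(product(P[0], P[1]))


def discounted_value(policy, beta):
    """Solve (I - beta * P_f) v = r_f for a two-state policy f by Cramer's rule."""
    (p00, p01), (p10, p11) = (P[s][policy[s]] for s in (0, 1))
    r0, r1 = (R[s][policy[s]] for s in (0, 1))
    a, b = 1 - beta * p00, -beta * p01
    c, d = -beta * p10, 1 - beta * p11
    det = a * d - b * c
    return ((r0 * d - b * r1) / det, (a * r1 - c * r0) / det)


def optimal_policies(beta, tol=1e-9):
    """Return the policies whose discounted value is maximal in every state."""
    values = {f: discounted_value(f, beta) for f in POLICIES}
    return [
        f for f in POLICIES
        if all(values[f][s] >= values[g][s] - tol for g in POLICIES for s in (0, 1))
    ]


# Discount factors approaching 1 (an arbitrary grid chosen for this sketch).
BETAS = [0.9, 0.99, 0.999, 0.9999]
for beta in BETAS:
    print(beta, optimal_policies(beta))
```

In this toy model the check can also be done by hand: in state 0, action 1 yields discounted value 2, while action 0 yields 1/(1 - 0.5β), which is strictly below 2 for every β < 1 and reaches 2 only in the limit β → 1. A numerical scan over a grid of discount factors is only suggestive in general; it does not by itself prove Blackwell optimality.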

Author information

Authors and Affiliations

Arie Hordijk: Department of Mathematics and Computer Science, Leiden University, 2300 RA Leiden, The Netherlands (e-mail: hordijk@wi.leidenuniv.nl)
Alexander A. Yushkevich: Department of Mathematics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA (e-mail: aayushke@email.uncc.edu)

Additional information

Manuscript received: March 1998 / final version received: July 1998

About this article

Cite this article

Hordijk, A., Yushkevich, A. Blackwell optimality in the class of stationary policies in Markov decision chains with a Borel state space and unbounded rewards. Mathematical Methods of OR 49, 1–39 (1999). https://doi.org/10.1007/s001860050011

Issue Date: March 1999
DOI: https://doi.org/10.1007/s001860050011
