
Nonstationary denumerable state Markov decision processes – with average variance criterion

Published in: Mathematical Methods of Operations Research

Abstract

In this paper, we consider nonstationary Markov decision processes (MDPs, for short) with the average variance criterion on a countable state space, with finite action sets and bounded one-step rewards. Using the optimality equations established in this paper, we translate the average variance criterion into a new average expected cost criterion. We then prove that there exists a Markov policy that is optimal for the original average expected reward criterion and minimizes the average variance within the class of policies optimal for that criterion.
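To give some intuition for the criterion, consider the stationary special case: for a fixed Markov policy with a stationary distribution, the average expected reward and the average variance reduce to the mean and variance of the one-step reward under that distribution. The sketch below is an illustrative assumption of ours, not taken from the paper (which treats the nonstationary case); the two-state chain, its transition matrix, and the reward values are made up for demonstration.

```python
# Toy illustration (our assumption, not from the paper): under a fixed
# Markov policy on a two-state chain, the average reward g and the
# average variance V are moments of r under the stationary distribution.

# Transition matrix induced by some fixed policy (rows sum to 1).
P = [[0.9, 0.1],
     [0.5, 0.5]]
r = [1.0, 0.0]  # bounded one-step rewards, as the paper assumes

def stationary(P, iters=1000):
    """Approximate the stationary distribution by power iteration."""
    pi = [1.0 / len(P)] * len(P)
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(len(P)))
              for j in range(len(P))]
    return pi

pi = stationary(P)
# Average expected reward: g = sum_s pi(s) r(s)
g = sum(p * ri for p, ri in zip(pi, r))
# Average variance: V = sum_s pi(s) (r(s) - g)^2
V = sum(p * (ri - g) ** 2 for p, ri in zip(pi, r))

print(round(g, 4), round(V, 4))  # → 0.8333 0.1389
```

Among policies achieving the same average reward g, the paper's result singles out one minimizing V; the sketch only shows how both quantities are evaluated for a single fixed policy.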



Additional information

Manuscript received: October 1997/final version received: March 1998



Cite this article

Guo, X. Nonstationary denumerable state Markov decision processes – with average variance criterion. Mathematical Methods of OR 49, 87–96 (1999). https://doi.org/10.1007/PL00020908

