Compactness of the space of non-randomized policies in countable-state sequential decision processes

Mathematical Methods of Operations Research

Abstract

For sequential decision processes with countable state spaces, we prove compactness of the set of strategic measures corresponding to nonrandomized policies. For the Borel state case, this set may not be compact (Piunovskiy, Optimal control of random sequences in problems with constraints. Kluwer, Boston, p. 170, 1997) in spite of compactness of the set of strategic measures corresponding to all policies (Schäl, On dynamic programming: compactness of the space of policies. Stoch Processes Appl 3(4):345–364, 1975b; Balder, On compactness of the space of policies in stochastic dynamic programming. Stoch Processes Appl 32(1):141–150, 1989). We use the compactness result from this paper to show the existence of optimal policies for countable-state constrained optimization of expected discounted and nonpositive rewards, when the optimality is considered within the class of nonrandomized policies. This paper also studies the convergence of a value-iteration algorithm for such constrained problems.

References

  • Altman E (1999) Constrained Markov decision processes. Chapman & Hall/CRC, Boca Raton, USA

  • Balder EJ (1989) On compactness of the space of policies in stochastic dynamic programming. Stoch Process Appl 32(1): 141–150

  • Billingsley P (1986) Convergence of probability measures. Wiley, New York

  • Blackwell D (1967) Positive dynamic programming. In: Proceedings of the 5th Berkeley symposium on mathematical statistics and probability, University of California Press, Berkeley, vol 1, pp 415–418

  • Borkar VS (1991) Topics in controlled Markov chains. Longman Scientific & Technical, Harlow

  • Chen RC, Blankenship GL (2004) Dynamic programming equations for discounted constrained stochastic control. IEEE Trans Automat Contr 49(5): 699–709

  • Chen RC, Feinberg EA (2007) Non-randomized policies for constrained Markov decision processes. Math Meth Oper Res 66(1): 165–179

  • Dynkin EB, Yushkevich AA (1979) Controlled Markov processes and their applications. Springer, New York

  • Feinberg EA (2000) Constrained discounted Markov decision processes and Hamiltonian cycles. Math Oper Res 25(1): 130–140

  • Feinberg EA, Shwartz A (1996) Constrained discounted dynamic programming. Math Oper Res 21(4): 922–945

  • Hernandez-Lerma O, Lasserre JB (1996) Discrete-time Markov control processes. Springer, New York

  • Kechris AS (1995) Classical descriptive set theory. Springer, New York

  • Nikaido H (1968) Convex structures and economic theory. Academic Press, New York

  • Nowak AS (1988) On the weak topology in a space of probability measures induced by policies. Bull Polish Acad Sci Math 36(3–4): 181–186

  • Piunovskiy AB (1997) Optimal control of random sequences in problems with constraints. Kluwer, Boston

  • Royden HL (1968) Real analysis. MacMillan Publishing Co, New York

  • Schäl M (1975a) Conditions for optimality in dynamic programming and for the limit of n-stage optimal policies to be optimal. Z Wahrsch verw Gebiete 32(3): 179–196

  • Schäl M (1975b) On dynamic programming: compactness of the space of policies. Stoch Process Appl 3(4): 345–364

  • Schäl M (1979) On dynamic programming and statistical decision theory. Ann Stat 7(2): 432–445

  • Strauch R (1966) Negative dynamic programming. Ann Math Stat 37(4): 871–890

Author information

Corresponding author

Correspondence to Richard C. Chen.

About this article

Cite this article

Chen, R.C., Feinberg, E.A. Compactness of the space of non-randomized policies in countable-state sequential decision processes. Math Meth Oper Res 71, 307–323 (2010). https://doi.org/10.1007/s00186-009-0298-1

  • DOI: https://doi.org/10.1007/s00186-009-0298-1
