Compactness of the space of non-randomized policies in countable-state sequential decision processes

Chen, Richard C.; Feinberg, Eugene A.

doi:10.1007/s00186-009-0298-1

Published: 10 January 2010

Volume 71, pages 307–323 (2010)

Mathematical Methods of Operations Research

Richard C. Chen¹ &
Eugene A. Feinberg²

Abstract

For sequential decision processes with countable state spaces, we prove compactness of the set of strategic measures corresponding to nonrandomized policies. For the Borel state case, this set may not be compact (Piunovskiy, Optimal control of random sequences in problems with constraints. Kluwer, Boston, p. 170, 1997) in spite of compactness of the set of strategic measures corresponding to all policies (Schäl, On dynamic programming: compactness of the space of policies. Stoch Process Appl 3(4):345–364, 1975b; Balder, On compactness of the space of policies in stochastic dynamic programming. Stoch Process Appl 32(1):141–150, 1989). We use the compactness result from this paper to show the existence of optimal policies for countable-state constrained optimization of expected discounted and nonpositive rewards, when the optimality is considered within the class of nonrandomized policies. This paper also studies the convergence of a value-iteration algorithm for such constrained problems.
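
To make "optimality within the class of nonrandomized policies" concrete, the following is a minimal sketch, not the paper's method. It assumes a hypothetical two-state, two-action discounted model with one constraint, and it restricts attention to stationary nonrandomized policies, which is a much smaller class than the one the paper treats (countable state spaces and general nonrandomized policies). The names `evaluate`, `best_feasible`, `P`, `r0`, and `r1` are illustrative assumptions. The sketch enumerates every stationary nonrandomized policy, evaluates the objective and the constrained reward, and keeps the best feasible one.

```python
from itertools import product


def evaluate(policy, P, r, beta, iters=500):
    """Discounted value of a stationary nonrandomized policy.

    Uses plain fixed-point iteration v <- r_pi + beta * P_pi v.
    The iteration count is an assumption that is ample for beta = 0.5;
    a discount factor close to 1 would need more iterations.
    """
    n = len(P)
    v = [0.0] * n
    for _ in range(iters):
        v = [
            r[s][policy[s]]
            + beta * sum(P[s][policy[s]][t] * v[t] for t in range(n))
            for s in range(n)
        ]
    return v


def best_feasible(P, r0, r1, beta, bound, start):
    """Best stationary nonrandomized policy subject to W1 >= bound.

    Returns (policy, W0, W1) evaluated at the start state, or None
    when no enumerated policy satisfies the constraint.
    """
    n_states, n_actions = len(P), len(P[0])
    best = None
    for policy in product(range(n_actions), repeat=n_states):
        w0 = evaluate(policy, P, r0, beta)[start]
        w1 = evaluate(policy, P, r1, beta)[start]
        feasible = w1 >= bound - 1e-9
        if feasible and (best is None or w0 > best[1]):
            best = (policy, w0, w1)
    return best


# Hypothetical model: P[s][a][t] is the probability of moving from s to t under a.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # state 0: action 0 stays, action 1 moves to state 1
    [[0.0, 1.0], [0.0, 1.0]],  # state 1 is absorbing under both actions
]
r0 = [[1.0, 0.0], [0.0, 0.2]]  # objective reward r0[s][a]
r1 = [[0.0, 0.0], [1.0, 0.5]]  # constrained reward r1[s][a]

result = best_feasible(P, r0, r1, beta=0.5, bound=0.4, start=0)
print(result)
```

In this toy model the unconstrained optimum stays in state 0 forever, but that policy earns nothing of the constrained reward, so the constraint forces a move to state 1. With finitely many candidate policies the maximum is trivially attained; the compactness result in the abstract is what supplies existence when the class of nonrandomized policies is infinite.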

References

Altman E (1999) Constrained Markov decision processes. Chapman & Hall/CRC, Boca Raton, USA
Balder EJ (1989) On compactness of the space of policies in stochastic dynamic programming. Stoch Process Appl 32(1): 141–150
Billingsley P (1986) Convergence of probability measures. Wiley, New York
Blackwell D (1967) Positive dynamic programming. In: Proceedings of the 5th Berkeley symposium on mathematical statistics and probability, University of California Press, Berkeley, vol 1, pp 415–418
Borkar VS (1991) Topics in controlled Markov chains. Longman Scientific & Technical, Harlow
Chen RC, Blankenship GL (2004) Dynamic programming equations for discounted constrained stochastic control. IEEE Trans Automat Contr 49(5): 699–709
Chen RC, Feinberg EA (2007) Non-randomized policies for constrained Markov decision processes. Math Meth Oper Res 66(1): 165–179
Dynkin EB, Yushkevich AA (1979) Controlled Markov processes and their applications. Springer, New York
Feinberg EA (2000) Constrained discounted Markov decision processes and Hamiltonian cycles. Math Oper Res 25(1): 130–140
Feinberg EA, Shwartz A (1996) Constrained discounted dynamic programming. Math Oper Res 21(4): 922–945
Hernandez-Lerma O, Lasserre JB (1996) Discrete-time Markov control processes. Springer, New York
Kechris AS (1995) Classical descriptive set theory. Springer, New York
Nikaido H (1968) Convex structures and economic theory. Academic Press, New York
Nowak AS (1988) On the weak topology in a space of probability measures induced by policies. Bull Polish Acad Sci Math 36(3–4): 181–186
Piunovskiy AB (1997) Optimal control of random sequences in problems with constraints. Kluwer, Boston
Royden HL (1968) Real analysis. MacMillan Publishing Co, New York
Schäl M (1975a) Conditions for optimality in dynamic programming and for the limit of n-stage optimal policies to be optimal. Z Wahrsch verw Gebiete 32(3): 179–196
Schäl M (1975b) On dynamic programming: compactness of the space of policies. Stoch Process Appl 3(4): 345–364
Schäl M (1979) On dynamic programming and statistical decision theory. Ann Stat 7(2): 432–445
Strauch R (1966) Negative dynamic programming. Ann Math Stat 37(4): 871–890

Author information

Authors and Affiliations

Naval Research Laboratory, Code 5341, 4555 Overlook Ave. SW, Washington, DC, 20375, USA
Richard C. Chen
Department of Applied Mathematics and Statistics, State University of New York, Stony Brook, NY, 11794-3600, USA
Eugene A. Feinberg

Corresponding author

Correspondence to Richard C. Chen.

About this article

Cite this article

Chen, R.C., Feinberg, E.A. Compactness of the space of non-randomized policies in countable-state sequential decision processes. Math Meth Oper Res 71, 307–323 (2010). https://doi.org/10.1007/s00186-009-0298-1

Received: 25 November 2008
Accepted: 24 November 2009
Published: 10 January 2010
Issue Date: April 2010
DOI: https://doi.org/10.1007/s00186-009-0298-1
