Abstract
We consider the ergodic control of a Markov chain on a countable state space with a compact action space in the presence of finitely many (say, m) ergodic constraints. Under a condition on the cost functions that penalizes instability, the existence of an optimal stable stationary strategy randomized at a maximum of m states is established using convex analytic arguments.
Cite this article
Borkar, V.S. Controlled Markov chains with constraints. Sadhana 15, 405–413 (1990). https://doi.org/10.1007/BF02811335