Abstract
Linear Programming is known to be an important and useful tool for solving Markov Decision Processes (MDPs). Its derivation relies on the Dynamic Programming approach, which can itself be used to solve MDPs. However, for Markov Decision Processes with several constraints, the only available methods are based on Linear Programs. The aim of this paper is to investigate some aspects of such Linear Programs related to multi-chain MDPs. We first present a stochastic interpretation of the decision variables that appear in the Linear Programs available in the literature. We then show, for the multi-constrained Markov Decision Process, that the Linear Program suggested in [9] can be obtained from an equivalent unconstrained Lagrange formulation of the control problem. This shows the connection between the Linear Program approach and the Lagrange approach, which was previously used only for the case of a single constraint [3, 14, 15].
References
[1] Altman E, Schwartz A (1991) Markov decision problems and state-action frequencies. SIAM J Control and Optimization 29/4:786–809
[2] Altman E (1994) Denumerable constrained Markov decision problems and finite approximations. Math of OR 19:169–191
[3] Beutler FJ, Ross KW (1985) Optimal policies for controlled Markov chains with a constraint. J Math Anal Appl 112:236–252
[4] Borkar VS (1988) A convex analytic approach to Markov decision processes. Probab Th Rel Fields 78:583–602
[5] Borkar VS (1991) Topics in controlled Markov chains. Pitman
[6] Dembo A, Zeitouni O (1993) Large deviations techniques and applications. Jones and Bartlett
[7] Derman C (1970) Finite state Markovian decision processes. Academic Press
[8] Hordijk A, Kallenberg LCM (1979) Linear programming and Markov decision chains. Management Science 25/4:352–362
[9] Hordijk A, Kallenberg LCM (1984) Constrained undiscounted stochastic dynamic programming. Math of OR 9:277–298
[10] Kallenberg LCM (1983) Linear programming and finite Markovian control problems. Math Centre Tracts 148, Amsterdam
[11] Luenberger DG (1968) Optimization by vector space methods. John Wiley
[12] Ross K, Varadarajan R (1991) Multichain Markov decision processes with a sample path constraint: a decomposition approach. Math of OR 16/1:195–207
[13] Seneta E (1981) Non-negative matrices and Markov chains. Springer-Verlag
[14] Sennott LI (1991) Constrained discounted Markov decision chains. Probability in the Engineering and Informational Sciences 5:463–475
[15] Sennott LI (1993) Constrained average cost Markov decision chains. Probability in the Engineering and Informational Sciences 7:69–83
[16] Spieksma F (1990) Geometrically ergodic Markov chains and the optimal control of queues. PhD thesis, Leiden
Cite this article
Altman, E., Spieksma, F. The Linear Program approach in multi-chain Markov Decision Processes revisited. ZOR - Methods and Models of Operations Research 42, 169–188 (1995). https://doi.org/10.1007/BF01415752