Zeitschrift für Operations Research

Volume 40, Issue 1, pp 91–108

Linear programming formulation of MDPs in countable state space: The multichain case

  • Arie Hordijk
  • Jean B. Lasserre
Abstract

We present a linear programming formulation of MDPs with countable state and action spaces and no unichain assumption. It extends the Hordijk and Kallenberg (1979) formulation for finite state and action spaces. We provide sufficient conditions for both the existence of optimal solutions to the primal linear program and the absence of a duality gap. Under these conditions, the existence of a (possibly randomized) average optimal policy is also guaranteed. The existence of a stationary deterministic average optimal policy is also investigated.
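For orientation, the finite-state multichain formulation that the abstract refers to can be sketched as the following pair of linear programs. This is a sketch of the standard finite model under assumed notation, not the countable-state formulation of the paper itself: the finite state space S, action sets A(i), transition probabilities p(j | i,a), one-step rewards r(i,a), and strictly positive weights β_j are all symbols introduced here for illustration, and a cost-minimizing model would reverse the inequalities and the direction of optimization.

```latex
% Finite-state multichain average-reward MDP (assumed notation, reward maximization).
% First program: variables g (average reward) and h (bias-like term), both free.
\begin{align*}
\min_{g,\,h}\quad & \sum_{j \in S} \beta_j\, g_j \\
\text{s.t.}\quad
  & g_i - \sum_{j \in S} p(j \mid i,a)\, g_j \;\ge\; 0,
    && i \in S,\ a \in A(i), \\
  & g_i + h_i - \sum_{j \in S} p(j \mid i,a)\, h_j \;\ge\; r(i,a),
    && i \in S,\ a \in A(i).
\end{align*}

% Second program (the LP dual of the first): variables x, y indexed by state-action pairs.
\begin{align*}
\max_{x,\,y}\quad & \sum_{i \in S} \sum_{a \in A(i)} r(i,a)\, x(i,a) \\
\text{s.t.}\quad
  & \sum_{i \in S} \sum_{a \in A(i)} \bigl(\delta_{ij} - p(j \mid i,a)\bigr)\, x(i,a) \;=\; 0,
    && j \in S, \\
  & \sum_{a \in A(j)} x(j,a)
    + \sum_{i \in S} \sum_{a \in A(i)} \bigl(\delta_{ij} - p(j \mid i,a)\bigr)\, y(i,a) \;=\; \beta_j,
    && j \in S, \\
  & x(i,a) \ge 0,\quad y(i,a) \ge 0,
    && i \in S,\ a \in A(i).
\end{align*}
```

In the finite case the optimal g of the first program is the optimal average reward vector, and the second set of constraints on g is what handles the multichain structure: without a unichain assumption the average reward may depend on the initial state, so g cannot be replaced by a single scalar. With countably many states both programs become infinite-dimensional, which is why existence of optimal solutions and absence of a duality gap require the sufficient conditions mentioned in the abstract.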

Key words

Markov decision processes, countable state space, linear programming, duality

References

  1. Altman E, Shwartz A (1991) Markov decision problems and state-action frequencies. SIAM J Contr Opt 29:786–809
  2. Anderson EJ, Nash P (1987) Linear programming in infinite dimensional spaces. Wiley, Chichester
  3. Borkar V (1988) A convex analytic approach to Markov decision processes. Prob Th Rel Fields 78:583–602
  4. DeGhellinck GT (1960) Les problèmes de décisions séquentielles. Cah Cent Etud Rech Oper 2:161–179
  5. Denardo EV, Fox BL (1968) Multichain Markov renewal programs. SIAM J Appl Math 16:468–487
  6. Denardo EV (1970) On linear programming in a Markov decision problem. Manag Sci 16:281–288
  7. D'Epenoux F (1960) Sur un problème de production et de stockage dans l'aléatoire. Rev Fr Rech Oper 14:3–16
  8. Derman C (1970) Finite state Markovian decision processes. Academic Press, New York
  9. Heilmann WR (1978) Solving stochastic dynamic programming by linear programming: An annotated bibliography. Zeit Oper Res 22:43–53
  10. Hernandez-Lerma O, Lasserre JB (1993) Linear programming and average optimality of Markov control processes on Borel spaces: unbounded costs. SIAM J Contr Opt, to appear
  11. Hordijk A, Kallenberg LCM (1979) Linear programming and Markov decision chains. Manag Sci 25:352–362
  12. Kallenberg LCM (1983) Linear programming and finite Markovian control problems. Mathematical Centre Tracts 148, Mathematical Centre, Amsterdam
  13. Kurano M (1989) The existence of a minimum pair of state and policy for Markov decision processes under the hypothesis of Doeblin. SIAM J Contr Opt 27:296–307
  14. Lasserre JB (1993) Average optimal policies and linear programming in countable state Markov decision processes. J Math Anal Appl, to appear
  15. Manne AS (1960) Linear programming and sequential decisions. Manag Sci 6:259–267
  16. Spieksma F (1990) Geometrically ergodic Markov chains and the optimal control of queues. PhD Thesis, University of Leiden
  17. Yamada K (1975) Duality theorem in Markovian decision problems. J Math Anal Appl 50:579–595
  18. Yosida K (1978) Functional analysis. 5th Ed., Springer-Verlag, Berlin

Copyright information

© Physica-Verlag 1994

Authors and Affiliations

  • Arie Hordijk (1)
  • Jean B. Lasserre (2)
  1. Dept of Mathematics and Computer Science, University of Leiden, Leiden RA, Leiden, The Netherlands
  2. Laboratoire d'Automatique et d'Analyse des Systèmes du CNRS, Toulouse Cédex, France