Linear programming formulation of MDPs in countable state space: The multichain case

Zeitschrift für Operations Research

Abstract

We present a linear programming formulation of MDPs with countable state and action spaces and no unichain assumption. This extends the Hordijk and Kallenberg (1979) formulation for finite state and action spaces. We provide sufficient conditions for both the existence of optimal solutions to the primal linear program and the absence of a duality gap. Under these conditions, the existence of a (possibly randomized) average optimal policy is also guaranteed. The existence of a stationary deterministic average optimal policy is also investigated.
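
For orientation, the finite-state multichain linear program that the abstract refers to can be sketched as follows. This is the form commonly given in the literature for average reward maximization, not the countable-state formulation of this article, which is not reproduced on this page. The notation is an assumption made for illustration: p_{ij}(a) are the transition probabilities, r_i(a) the one-step rewards, \delta_{ij} the Kronecker delta, and \beta_j > 0 arbitrary positive weights. The article itself may use costs rather than rewards and different symbols.

```latex
% Primal program (variables g_j, h_j), finite state space S and action sets A(i):
\begin{align*}
\min_{g,\,h}\quad & \sum_{j \in S} \beta_j\, g_j \\
\text{s.t.}\quad
  & \sum_{j \in S} \bigl(\delta_{ij} - p_{ij}(a)\bigr)\, g_j \;\ge\; 0,
      && i \in S,\ a \in A(i), \\
  & g_i + \sum_{j \in S} \bigl(\delta_{ij} - p_{ij}(a)\bigr)\, h_j \;\ge\; r_i(a),
      && i \in S,\ a \in A(i).
\end{align*}

% Dual program (variables x_i(a), y_i(a)):
\begin{align*}
\max_{x,\,y}\quad & \sum_{i \in S} \sum_{a \in A(i)} r_i(a)\, x_i(a) \\
\text{s.t.}\quad
  & \sum_{i \in S} \sum_{a \in A(i)} \bigl(\delta_{ij} - p_{ij}(a)\bigr)\, x_i(a) \;=\; 0,
      && j \in S, \\
  & \sum_{a \in A(j)} x_j(a)
    + \sum_{i \in S} \sum_{a \in A(i)} \bigl(\delta_{ij} - p_{ij}(a)\bigr)\, y_i(a) \;=\; \beta_j,
      && j \in S, \\
  & x_i(a) \ge 0,\quad y_i(a) \ge 0, && i \in S,\ a \in A(i).
\end{align*}
```

In the finite case, the g-component of an optimal primal solution is the optimal average reward vector, and a stationary policy can be read off from an optimal dual solution (x, y). The second group of variables (h in the primal, y in the dual) is what handles the multichain structure; under a unichain assumption the program reduces to a single set of balance constraints.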

References

  • Altman E, Shwartz A (1991) Markov decision problems and state-action frequencies. SIAM J Contr Opt 29:786–809

  • Anderson EJ, Nash P (1987) Linear programming in infinite dimensional spaces. Wiley, Chichester

  • Borkar V (1988) A convex analytic approach to Markov decision processes. Prob Th Rel Fields 78:583–602

  • DeGhellinck GT (1960) Les problèmes de décisions séquentielles. Cah Cent Etud Rech Oper 2:161–179

  • Denardo EV, Fox BL (1968) Multichain Markov renewal programs. SIAM J Appl Math 16:468–487

  • Denardo EV (1970) On linear programming in a Markov decision problem. Manag Sci 16:281–288

  • D'Epenoux F (1960) Sur un problème de production et de stockage dans l'aléatoire. Rev Fr Rech Oper 14:3–16

  • Derman C (1970) Finite state Markovian decision processes. Academic Press, New York

  • Heilmann WR (1978) Solving stochastic dynamic programming by linear programming — An annotated bibliography. Zeit Oper Res 22:43–53

  • Hernandez-Lerma O, Lasserre JB (1993) Linear programming and average optimality of Markov control processes on Borel spaces — unbounded costs. SIAM J Contr Opt to appear

  • Hordijk A, Kallenberg LCM (1979) Linear programming and Markov decision chains. Manag Sci 25:352–362

  • Kallenberg LCM (1983) Linear programming and finite Markovian control problems. Mathematical Centre Tracts 148, Mathematical Centre, Amsterdam

  • Kurano M (1989) The existence of a minimum pair of state and policy for Markov decision processes under the hypothesis of Doeblin. SIAM J Contr Opt 27:296–307

  • Lasserre JB (1993) Average optimal policies and linear programming in countable state Markov decision processes. J Math Anal Appl to appear

  • Manne AS (1960) Linear programming and sequential decisions. Manag Sci 6:259–267

  • Spieksma F (1990) Geometrically ergodic Markov chains and the optimal control of queues. PhD Thesis. University of Leiden

  • Yamada K (1975) Duality theorem in Markovian decision problems. J Math Anal Appl 50:579–595

  • Yosida K (1978) Functional analysis. 5th Ed., Springer-Verlag, Berlin

Cite this article

Hordijk, A., Lasserre, J.B. Linear programming formulation of MDPs in countable state space: The multichain case. ZOR - Methods and Models of Operations Research 40, 91–108 (1994). https://doi.org/10.1007/BF01414031
