Markov renewal decision processes with finite horizon

  • Theoretical Papers
  • Operations-Research-Spektrum

Summary

We investigate Markov renewal decision processes with finite horizon, countable state space, general action space and unbounded rewards. Under rather weak restrictions we derive the optimality equation and state conditions ensuring the convergence of successive approximations and the existence of optimal stationary policies. Strengthening the conditions we prove uniqueness of the solution of the optimality equation. Finally we discuss some numerical aspects including extrapolations using an equivalent optimality equation.
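The method of successive approximations mentioned in the summary can be illustrated on a toy problem. The sketch below is not the paper's model: it assumes a finite state and action set, integer sojourn times of at least one time unit, a lump-sum reward collected at each decision epoch, and zero terminal reward. All names (`bellman_step`, `successive_approximation`, `rewards`, `kernel`) are hypothetical. Under these assumptions the optimality equation reads v(i, t) = max_a [ r(i, a) + Σ p · v(j, t − s) ], summed over transitions (j, s, p) with s ≤ t, and repeated application of the operator reaches the fixed point after at most horizon + 1 steps.

```python
def bellman_step(v, rewards, kernel, horizon):
    """Apply the (assumed) optimality operator once to v[state][remaining_time]."""
    n_states = len(rewards)
    new_v = [[0.0] * (horizon + 1) for _ in range(n_states)]
    for i in range(n_states):
        # With no time left (t = 0) the value stays 0 (assumed zero terminal reward).
        for t in range(1, horizon + 1):
            best = float("-inf")
            for a, r in enumerate(rewards[i]):
                total = r  # lump-sum reward at the decision epoch (assumption)
                for j, s, p in kernel[i][a]:
                    # A sojourn longer than the remaining time ends the process.
                    if s <= t:
                        total += p * v[j][t - s]
                best = max(best, total)
            new_v[i][t] = best
    return new_v


def successive_approximation(rewards, kernel, horizon, tol=1e-12, max_iter=1000):
    """Iterate v_{n+1} = T v_n from v_0 = 0 until the change is at most tol."""
    n_states = len(rewards)
    v = [[0.0] * (horizon + 1) for _ in range(n_states)]
    for n in range(1, max_iter + 1):
        new_v = bellman_step(v, rewards, kernel, horizon)
        diff = max(
            abs(x - y)
            for row_new, row_old in zip(new_v, v)
            for x, y in zip(row_new, row_old)
        )
        v = new_v
        if diff <= tol:
            return v, n
    raise RuntimeError("successive approximation did not converge")


# Toy data (invented for illustration): rewards[i][a] and
# kernel[i][a] = list of (next_state, sojourn_time, probability).
rewards = [[1.0, 2.0], [0.5, 3.0]]
kernel = [
    [[(0, 1, 0.5), (1, 1, 0.5)], [(1, 2, 1.0)]],
    [[(0, 1, 1.0)], [(1, 3, 1.0)]],
]
horizon = 3

values, iterations = successive_approximation(rewards, kernel, horizon)
print(values[0][horizon], values[1][horizon], iterations)
```

Because every sojourn consumes at least one time unit in this toy setting, the n-th iterate is already exact for all remaining times t ≤ n. That is why the iteration stops after finitely many steps here. The paper's setting (countable state space, general action space, unbounded rewards) needs the conditions stated in the summary instead.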

Summary (translated from the German)

We study semi-Markov decision processes with finite horizon, countable state space, general action space and unbounded rewards. Under weak assumptions we derive the optimality equation and give sufficient conditions for the convergence of successive approximation and for the existence of optimal stationary policies. Under stronger assumptions we show that the solution of the optimality equation is unique. Finally we discuss some numerical aspects, including an extrapolation based on an equivalent optimality equation.

Schellhaas, H. Markov renewal decision processes with finite horizon. OR Spektrum 2, 33–40 (1980). https://doi.org/10.1007/BF01720156
