On Markovian decision programming with recursive reward functions

  • Methodology
  • Annals of Operations Research

Abstract

This paper discusses infinite-horizon Markovian decision programming with recursive reward functions. We show that Bellman's principle of optimality applies to our model, and then give a necessary and sufficient condition for a policy to be optimal. For the stationary case, we design an iteration algorithm for finding a stationary optimal policy. The algorithm generalizes the algorithms of Howard [7] and Iwamoto [3].
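
For orientation, the classical algorithm of Howard [7] that the paper generalizes can be sketched as follows. This is a minimal illustration of policy iteration for the ordinary discounted, additive-reward model only; it is not the recursive-reward algorithm of the paper, whose details are not given in the abstract. The function names, the data layout (`P[a][s][t]`, `R[a][s]`), the discount factor `beta`, and the two-state example are all assumptions made for this sketch.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def policy_iteration(P, R, beta):
    """Howard-style policy iteration for a finite discounted MDP.

    P[a][s][t]: probability of moving from state s to t under action a.
    R[a][s]:    one-step reward for taking action a in state s.
    beta:       discount factor, assumed to lie in (0, 1).
    """
    n_actions, n_states = len(P), len(P[0])
    policy = [0] * n_states  # arbitrary initial stationary policy
    while True:
        # Policy evaluation: solve (I - beta * P_pi) v = r_pi exactly.
        A = [[(1.0 if s == t else 0.0) - beta * P[policy[s]][s][t]
              for t in range(n_states)] for s in range(n_states)]
        b = [R[policy[s]][s] for s in range(n_states)]
        v = solve_linear(A, b)

        # Policy improvement: switch only on a strict gain, so the loop stops.
        changed = False
        for s in range(n_states):
            q = [R[a][s] + beta * sum(P[a][s][t] * v[t] for t in range(n_states))
                 for a in range(n_actions)]
            best = max(range(n_actions), key=lambda a: q[a])
            if q[best] > q[policy[s]] + 1e-12:
                policy[s] = best
                changed = True
        if not changed:
            return policy, v


# Hypothetical example: action 0 stays put, action 1 switches state.
P = [[[1.0, 0.0], [0.0, 1.0]],   # action 0
     [[0.0, 1.0], [1.0, 0.0]]]   # action 1
R = [[1.0, 2.0],                 # staying pays 1 in state 0, 2 in state 1
     [0.0, 0.0]]                 # switching pays nothing
policy, v = policy_iteration(P, R, beta=0.9)
```

In this toy example, staying in state 1 forever is worth 2 / (1 - 0.9) = 20, so from state 0 it is better to switch (0 + 0.9 × 20 = 18) than to stay (1 / (1 - 0.9) = 10). The evaluation step here is a linear solve because the reward is additive; under a recursive reward function that step would presumably take a different form.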

References

  1. N. Furukawa and S. Iwamoto, Markovian decision processes with recursive reward functions, Bull. Math. Statist. 15, 3–4(1973)79–91.

  2. N. Furukawa and S. Iwamoto, Correction to “Markovian decision processes with recursive reward functions”, Bull. Math. Statist. 16, 1–2(1974)127.

  3. S. Iwamoto, Discrete dynamic programming with a recursive additive system, Bull. Math. Statist. 16, 1–2(1974)49–66.

  4. N. Furukawa and S. Iwamoto, Dynamic programming on recursive reward systems, Bull. Math. Statist. 17, 1–2(1976)103–126.

  5. Dong Zeqing and Liu Ke, Structure of optimal policies for discounted Markovian decision programming, J. Math. Res. Exposition 6, 3(1986)125–134, in Chinese.

  6. D. Blackwell, Discrete dynamic programming, Ann. Math. Statist. 33(1962)719–726.

  7. R.A. Howard, Dynamic Programming and Markov Processes (Wiley, New York, 1960).

  8. Dong Zeqing, Lecture on Markovian decision programming, Institute of Applied Mathematics, Academia Sinica, Beijing, Mimeograph (1985), in Chinese.

  9. Dong Zeqing and Zhang Sheng, On the properties of ε(≥0) optimal policies in the discounted unbounded return model, Acta Math. Appl. Sinica (English Series) 3, 1(1987)15–25.

  10. Dong Zeqing and Liu Ke, Structure of optimal policies for discounted semi-Markov decision programming with unbounded rewards, Sci. Sinica (Ser. A), 4(1986)337–349.

Additional information

This research was supported by the National Natural Science Foundation of China.

About this article

Cite this article

Liu, J., Liu, K. On Markovian decision programming with recursive reward functions. Ann Oper Res 24, 145–164 (1990). https://doi.org/10.1007/BF02216820

  • DOI: https://doi.org/10.1007/BF02216820
