Markovian Decision Processes with Finite Transition Law

Dynamic Optimization

Part of the book series: Universitext (UTX)

Abstract

First, we introduce MDPs with a finite state space, prove the reward iteration, and derive the basic solution techniques: value iteration and the optimality criterion. Then MDPs with a finite transition law are considered; for these, the set of reachable states is finite.
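
As a quick orientation to the first of these techniques, the following is a minimal Python sketch of finite-horizon value iteration (backward induction) on a finite state space. The names S, A, r, p, and horizon are assumptions introduced here for illustration; they are not notation from the chapter.

    def value_iteration(S, A, r, p, horizon):
        """Backward induction: V_0 = 0 and
        V_{n+1}(s) = max_a [ r(s, a) + sum_t p(s, a, t) * V_n(t) ]."""
        V = {s: 0.0 for s in S}              # terminal values V_0
        policy = {}
        for _ in range(horizon):
            V_next = {}
            for s in S:
                best_value, best_action = float("-inf"), None
                for a in A:
                    # One-step reward plus expected value of the remaining stages.
                    q = r(s, a) + sum(p(s, a, t) * V[t] for t in S)
                    if q > best_value:
                        best_value, best_action = q, a
                V_next[s] = best_value
                policy[s] = best_action      # maximizer of the current backward step
            V = V_next
        return V, policy                     # V = V_horizon; policy = first-stage decision rule

For instance, with S = [0, 1], A = ["stay", "move"], reward r = lambda s, a: float(s == 1 and a == "stay"), and deterministic transitions p = lambda s, a, t: float(t == (s if a == "stay" else 1 - s)), the call value_iteration(S, A, r, p, horizon=3) returns V = {0: 2.0, 1: 3.0} and the policy that moves to state 1 and stays there.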

Copyright information

© 2016 Springer International Publishing AG

About this chapter

Cite this chapter

Hinderer, K., Rieder, U., Stieglitz, M. (2016). Markovian Decision Processes with Finite Transition Law. In: Dynamic Optimization. Universitext. Springer, Cham. https://doi.org/10.1007/978-3-319-48814-1_12
