Markovian Decision Processes with Finite Transition Law

Dynamic Optimization

Part of the book series: Universitext (UTX)

Abstract

First, we introduce MDPs with a finite state space, prove the reward iteration, and derive the basic solution techniques: value iteration and the optimality criterion. Then MDPs with a finite transition law are considered; for these, the set of reachable states is finite.
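
As a quick orientation to the first of these techniques, the following is a minimal Python sketch of finite-horizon value iteration (backward induction) on a finite state space. The names S, A, r, p, and horizon are assumptions introduced here for illustration; they are not notation from the chapter.

    def value_iteration(S, A, r, p, horizon):
        """Backward induction: V_0 = 0 and
        V_{n+1}(s) = max_a [ r(s, a) + sum_t p(s, a, t) * V_n(t) ]."""
        V = {s: 0.0 for s in S}              # terminal values V_0
        policy = {}
        for _ in range(horizon):
            V_next = {}
            for s in S:
                best_value, best_action = float("-inf"), None
                for a in A:
                    # One-step reward plus expected value of the remaining stages.
                    q = r(s, a) + sum(p(s, a, t) * V[t] for t in S)
                    if q > best_value:
                        best_value, best_action = q, a
                V_next[s] = best_value
                policy[s] = best_action      # maximizer of the current backward step
            V = V_next
        return V, policy                     # V = V_horizon; policy = first-stage decision rule

For instance, with S = [0, 1], A = ["stay", "move"], reward r = lambda s, a: float(s == 1 and a == "stay"), and deterministic transitions p = lambda s, a, t: float(t == (s if a == "stay" else 1 - s)), the call value_iteration(S, A, r, p, horizon=3) returns V = {0: 2.0, 1: 3.0} and the policy that moves to state 1 and stays there.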

Copyright information

© 2016 Springer International Publishing AG

About this chapter

Cite this chapter

Hinderer, K., Rieder, U., Stieglitz, M. (2016). Markovian Decision Processes with Finite Transition Law. In: Dynamic Optimization. Universitext. Springer, Cham. https://doi.org/10.1007/978-3-319-48814-1_12
