Experiment Design and Markov Decision Processes

Dimitrakakis, Christos; Ortner, Ronald

doi:10.1007/978-3-031-07614-5_6


Part of the book series: Intelligent Systems Reference Library (ISRL, volume 223)


Abstract

This chapter introduces Markov decision processes (MDPs), a very general formalism that can represent a wide range of sequential decision-making problems. An MDP can thus be used to model stochastic path problems and stopping problems, as well as problems in reinforcement learning, experiment design, and control.
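
To make the formalism concrete, the following is a minimal sketch of a finite MDP solved by value iteration. It is an illustration only and is not taken from the chapter: the two-state toy problem, the names `TOY_MDP` and `value_iteration`, the discount factor 0.9, and the convergence tolerance are all assumptions chosen for the example.

```python
from typing import Dict, List, Tuple

# Assumed encoding (hypothetical, for illustration):
# mdp[state][action] is a list of (probability, next_state, reward) outcomes.
# A state with no actions is treated as terminal with value 0.
MDP = Dict[str, Dict[str, List[Tuple[float, str, float]]]]

TOY_MDP: MDP = {
    "start": {
        # Deterministic move to the goal with a small reward.
        "safe": [(1.0, "goal", 1.0)],
        # Larger reward half of the time, otherwise stay in place.
        "risky": [(0.5, "goal", 3.0), (0.5, "start", 0.0)],
    },
    "goal": {},  # terminal state
}


def value_iteration(mdp: MDP, discount: float = 0.9, tol: float = 1e-8) -> Dict[str, float]:
    """Repeat Bellman optimality backups until the largest change is below tol."""
    values = {state: 0.0 for state in mdp}
    while True:
        delta = 0.0
        for state, actions in mdp.items():
            if not actions:  # terminal states keep value 0
                continue
            # Expected return of each action under the current value estimates.
            best = max(
                sum(p * (r + discount * values[nxt]) for p, nxt, r in outcomes)
                for outcomes in actions.values()
            )
            delta = max(delta, abs(best - values[state]))
            values[state] = best
        if delta < tol:
            return values


if __name__ == "__main__":
    print(value_iteration(TOY_MDP))
```

In this toy problem the value of `start` satisfies V = max(1, 1.5 + 0.45 V), so the risky action is preferred and V = 1.5 / 0.55.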


Notes

1. Thus, the resulting complexity is only weakly polynomial, because it depends on the size of the input description.

References

Chernoff, H.: Sequential design of experiments. Ann. Math. Stat. 30(3), 755–770 (1959)
Chernoff, H.: Sequential models for clinical trials. In: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, vol. 4, pp. 805–812. University of California Press (1966)
Puterman, M.L.: Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley, New Jersey, US (1994)
Tseng, P.: Solving H-horizon, stationary Markov decision problems in time proportional to log(H). Oper. Res. Lett. 9(5), 287–297 (1990)
Ye, Y.: The simplex and policy-iteration methods are strongly polynomial for the Markov decision problem with a fixed discount rate. Math. Oper. Res. 36(4), 593–603 (2011)
DeGroot, M.H.: Optimal Statistical Decisions. Wiley (1970)
O’Gordon Duff, M.: Optimal learning: computational procedures for Bayes-adaptive Markov decision processes. Ph.D. thesis, University of Massachusetts at Amherst (2002)
Gittins, J.C.: Multi-armed Bandit Allocation Indices. Wiley, New Jersey, US (1989)
Bertsekas, D.P., Tsitsiklis, J.N.: Neuro-Dynamic Programming. Athena Scientific (1996)
Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press (1998)


Author information

Authors and Affiliations

Informatique, Université de Neuchâtel, Neuchâtel, Switzerland
Christos Dimitrakakis
Department Mathematik und Informationstechnologie, Montanuniversität Leoben, Leoben, Austria
Ronald Ortner


Corresponding author

Correspondence to Christos Dimitrakakis.


About this chapter

Cite this chapter

Dimitrakakis, C., Ortner, R. (2022). Experiment Design and Markov Decision Processes. In: Decision Making Under Uncertainty and Reinforcement Learning. Intelligent Systems Reference Library, vol 223. Springer, Cham. https://doi.org/10.1007/978-3-031-07614-5_6


DOI: https://doi.org/10.1007/978-3-031-07614-5_6
Published: 03 December 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-07612-1
Online ISBN: 978-3-031-07614-5
eBook Packages: Intelligent Technologies and Robotics (R0)
