Abstract
This chapter introduces sequential decision problems, in particular Markov decision processes (MDPs). A formal definition of an MDP is given, and the two most common solution techniques are described: value iteration and policy iteration. Factored MDPs are then described; these use graphical-model representations to solve very large MDPs compactly. An introduction to partially observable MDPs (POMDPs) is also included. The chapter concludes with two applications of MDPs: power plant control and service robot task coordination.
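As a preview of the solution techniques covered in the chapter, the following is a minimal sketch of value iteration on a toy two-state, two-action MDP. The transition probabilities, rewards, and discount factor here are illustrative assumptions, not taken from the chapter.

```python
# Minimal value iteration sketch on a hypothetical two-state, two-action MDP.
# The model (P, R) and the discount factor are illustrative assumptions.

# P[s][a] = list of (next_state, probability); R[s][a] = immediate reward.
P = {
    0: {0: [(0, 0.9), (1, 0.1)], 1: [(0, 0.2), (1, 0.8)]},
    1: {0: [(0, 0.5), (1, 0.5)], 1: [(1, 1.0)]},
}
R = {0: {0: 0.0, 1: 1.0}, 1: {0: 5.0, 1: 0.0}}
GAMMA = 0.9  # discount factor (see Note 2 below)

def value_iteration(P, R, gamma, eps=1e-6):
    """Repeat the Bellman optimality update
    V(s) = max_a [ R(s,a) + gamma * sum_s' P(s'|s,a) V(s') ]
    until the largest change falls below eps."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            q = {a: R[s][a] + gamma * sum(p * V[s2] for s2, p in P[s][a])
                 for a in P[s]}
            best = max(q.values())
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < eps:
            break
    # Extract a greedy policy from the converged value function.
    policy = {s: max(P[s], key=lambda a: R[s][a]
                     + gamma * sum(p * V[s2] for s2, p in P[s][a]))
              for s in P}
    return V, policy

V, pi = value_iteration(P, R, GAMMA)
print(V, pi)
```

Policy iteration reaches the same fixed point by alternating policy evaluation with greedy policy improvement, typically in fewer (though more expensive) iterations; for discounted MDPs both methods converge to the optimal value function.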
Notes
1. This assumes that the defined reward function correctly models the desired objective.
2. For financial investments, the discount factor has a natural interpretation in terms of inflation or interest rates. For most other applications there is no obvious way to choose it, and in practice a value close to one, such as 0.9, is used; a worked example follows.
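To make Note 2 concrete, here is a small worked example (the numbers are illustrative): with $\gamma = 0.9$, a constant reward $r$ received at every step has a finite discounted sum, and distant rewards contribute almost nothing:

$$\sum_{t=0}^{\infty} \gamma^{t} r \;=\; \frac{r}{1-\gamma} \;=\; \frac{r}{1-0.9} \;=\; 10\,r, \qquad \gamma^{30} = 0.9^{30} \approx 0.04 .$$

Thus a reward thirty steps ahead is worth only about 4% of the same reward received now, which is why the discount factor controls how far-sighted the optimal policy is.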