
Markov Decision Processes

Chapter in: Probabilistic Graphical Models

Part of the book series: Advances in Computer Vision and Pattern Recognition (ACVPR)

Abstract

This chapter introduces sequential decision problems, in particular Markov decision processes (MDPs). A formal definition of an MDP is given, and the two most common solution techniques are described: value iteration and policy iteration. Factored MDPs, which provide a representation based on graphical models for solving very large MDPs, are then described. An introduction to partially observable MDPs (POMDPs) is also included. The chapter concludes by describing two applications of MDPs: power plant control and service robot task coordination.
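As a concrete illustration of the first of these solution techniques, the following minimal sketch implements value iteration for a small, explicitly enumerated MDP in Python with NumPy. This is an illustrative sketch, not code from the chapter: the array layout, the function and variable names, and the stopping test are assumptions made here.

    import numpy as np

    def value_iteration(P, R, gamma=0.9, eps=1e-6):
        # P: transition tensor of shape (A, S, S), P[a, s, t] = Pr(t | s, a)
        # R: reward matrix of shape (S, A), R[s, a] = immediate reward
        # gamma: discount factor in [0, 1); eps: convergence threshold
        S, A = R.shape
        V = np.zeros(S)
        while True:
            # Bellman backup: Q(s, a) = R(s, a) + gamma * E[V(s') | s, a]
            Q = R + gamma * np.einsum('ast,t->sa', P, V)
            V_new = Q.max(axis=1)  # best achievable value in each state
            if np.max(np.abs(V_new - V)) < eps:
                return V_new, Q.argmax(axis=1)  # value function, greedy policy
            V = V_new

    # Toy 2-state, 2-action MDP (numbers invented for illustration only)
    P = np.array([[[0.9, 0.1],
                   [0.2, 0.8]],   # action 0
                  [[0.5, 0.5],
                   [0.1, 0.9]]])  # action 1
    R = np.array([[1.0, 0.0],
                  [0.0, 2.0]])
    V, policy = value_iteration(P, R)

Policy iteration differs in that it alternates full policy evaluation (solving a linear system for the value of the current policy) with greedy policy improvement; it typically converges in fewer, though more expensive, iterations.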


Notes

  1. This assumes that the defined reward function correctly models the desired objective.

  2. This has an obvious value in the case of financial investments, where it relates to inflation or interest rates. For other applications there is usually no clear way to determine the discount factor, so in general a value close to one, such as 0.9, is used.
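For reference, the quantity that the discount factor weights is the expected discounted return; this is standard MDP notation rather than material from this page. With gamma = 0.9, a reward earned ten steps in the future is weighted by 0.9^10, roughly 0.35:

    % Expected discounted return of policy \pi from initial state s_0.
    % Requiring 0 \le \gamma < 1 keeps the infinite sum finite for bounded rewards.
    V^{\pi}(s_0) = \mathbb{E}\left[ \sum_{t=0}^{\infty} \gamma^{t}\, r_t \;\middle|\; s_0, \pi \right]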


Author information

Correspondence to Luis Enrique Sucar.


Copyright information

© 2015 Springer-Verlag London

About this chapter

Cite this chapter

Sucar, L.E. (2015). Markov Decision Processes. In: Probabilistic Graphical Models. Advances in Computer Vision and Pattern Recognition. Springer, London. https://doi.org/10.1007/978-1-4471-6699-3_11


  • DOI: https://doi.org/10.1007/978-1-4471-6699-3_11


  • Publisher Name: Springer, London

  • Print ISBN: 978-1-4471-6698-6

  • Online ISBN: 978-1-4471-6699-3

  • eBook Packages: Computer Science, Computer Science (R0)
