Abstract
As discussed in Chapter 1, reinforcement learning involves sequential decision-making. In this chapter, we formalize this notion using stochastic processes, the branch of probability that models sequential decision-making behavior. While most of the problems we study in reinforcement learning are modeled as Markov decision processes (MDPs), we start by introducing Markov chains (MCs), followed by Markov reward processes (MRPs). We finish by discussing MDPs in depth, covering the model setup and the assumptions behind it.
Copyright information
© 2021 Nimish Sanghi
About this chapter
Cite this chapter
Sanghi, N. (2021). Markov Decision Processes. In: Deep Reinforcement Learning with Python. Apress, Berkeley, CA. https://doi.org/10.1007/978-1-4842-6809-4_2
DOI: https://doi.org/10.1007/978-1-4842-6809-4_2
Publisher Name: Apress, Berkeley, CA
Print ISBN: 978-1-4842-6808-7
Online ISBN: 978-1-4842-6809-4
eBook Packages: Professional and Applied Computing, Apress Access Books, Professional and Applied Computing (R0)