Finite Markov Chains and Markov Decision Processes
Markov chains are important tools for stochastic modeling in many areas of the mathematical sciences. The first section of this article surveys the basic notions of discrete-time Markov chains on finite state spaces, together with several illustrative examples. Markov decision processes (MDPs), also known as stochastic dynamic programming or discrete-time stochastic control, are useful for decision making under uncertainty. The second section provides a simple formulation of MDPs with finite state and action spaces, and presents two fundamental algorithms for solving MDPs, value iteration and policy iteration, illustrated with an example based on the iPod shuffle.
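The value iteration algorithm mentioned above can be sketched as follows on a hypothetical two-state, two-action MDP. The transition probabilities, rewards, and discount factor below are illustrative placeholders, not data from the article; the iteration itself is the standard Bellman-backup scheme.

```python
# Minimal value iteration on a hypothetical finite MDP (illustrative data).
# P[s][a] is a list of (next_state, probability) pairs; R[s][a] is the
# immediate reward for taking action a in state s.
P = {
    0: {0: [(0, 0.5), (1, 0.5)], 1: [(1, 1.0)]},
    1: {0: [(0, 1.0)],           1: [(0, 0.3), (1, 0.7)]},
}
R = {0: {0: 1.0, 1: 0.0}, 1: {0: 5.0, 1: 2.0}}
gamma = 0.9  # discount factor (assumed < 1 so the backup is a contraction)

# Repeatedly apply the Bellman optimality operator until the value
# function stops changing (up to a small tolerance).
V = {s: 0.0 for s in P}
for _ in range(1000):
    V_new = {
        s: max(R[s][a] + gamma * sum(p * V[t] for t, p in P[s][a])
               for a in P[s])
        for s in P
    }
    if max(abs(V_new[s] - V[s]) for s in P) < 1e-9:
        V = V_new
        break
    V = V_new

# Extract a greedy (optimal) policy from the converged value function.
policy = {
    s: max(P[s], key=lambda a, s=s: R[s][a]
           + gamma * sum(p * V[t] for t, p in P[s][a]))
    for s in P
}
```

Because the discounted Bellman operator is a contraction, the loop converges geometrically at rate `gamma`; policy iteration instead alternates exact policy evaluation with greedy improvement and typically converges in far fewer (but more expensive) iterations.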
Keywords: Markov chain · Markov decision process · Mixing time · Coupling · Cutoff phenomenon
1. D.A. Levin, Y. Peres, E.L. Wilmer, Markov Chains and Mixing Times (American Mathematical Society, Providence, 2009)
2. J.R. Norris, Markov Chains (Cambridge University Press, Cambridge, 1997)
3. P. Norvig, Doing the Martin Shuffle (with your iPod). Available at http://norvig.com/ipod.html