Dynamic Programming and Reinforcement Learning

Machine Learning and Artificial Intelligence

Abstract

This chapter studies dynamic programming. Starting from the fundamental equation of dynamic programming as defined by Bellman, we examine its generalization and identify the class of problems that the dynamic programming framework can solve. We then study reinforcement learning in detail as one subcategory of dynamic programming, covering the concepts of exploration and exploitation and the tradeoff between them that yields the best performance. Finally, we look at two variants of reinforcement learning: Q-learning and SARSA.
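
As a rough illustration of the ideas named in the abstract, the following Python sketch shows an epsilon-greedy action choice (one common way to trade exploration against exploitation) and the bootstrapped targets that distinguish Q-learning from SARSA. It is not taken from the chapter; the function names, the list-based Q-table rows, and the parameters `epsilon`, `gamma`, and `alpha` are assumptions chosen for illustration.

```python
import random


def epsilon_greedy(q_row, epsilon, rng):
    """Choose an action index from one row of a Q-table (hypothetical layout).

    With probability epsilon a random action is taken (exploration);
    otherwise the action with the highest current estimate (exploitation).
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=lambda a: q_row[a])


def q_learning_target(reward, next_q_row, gamma):
    # Off-policy target: bootstrap from the best action in the next state,
    # regardless of which action the agent will actually take.
    return reward + gamma * max(next_q_row)


def sarsa_target(reward, next_q_row, next_action, gamma):
    # On-policy target: bootstrap from the action actually chosen next.
    return reward + gamma * next_q_row[next_action]


def td_update(q_value, target, alpha):
    # Move the current estimate a fraction alpha toward the target.
    return q_value + alpha * (target - q_value)


if __name__ == "__main__":
    rng = random.Random(0)
    next_row = [0.0, 2.0]                              # estimates for the next state
    next_action = epsilon_greedy(next_row, 0.1, rng)   # action the agent will take
    print(td_update(0.0, q_learning_target(1.0, next_row, 0.5), 0.5))
    print(td_update(0.0, sarsa_target(1.0, next_row, next_action, 0.5), 0.5))
```

The two targets coincide whenever the next action is the greedy one; they differ only on exploratory steps, which is where the on-policy and off-policy distinction shows up.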

References

  1. Wikipedia - Dynamic Programming Applications https://en.wikipedia.org/wiki/Dynamic_programming#Algorithms_that_use_dynamic_programming

  2. Shannon number https://en.wikipedia.org/wiki/Shannon_number

  3. Deep Blue (chess computer) https://en.wikipedia.org/wiki/Deep_Blue_(chess_computer)

  4. Setting up Mario Bros. in OpenAI’s gym https://becominghuman.ai/getting-mario-back-into-the-gym-setting-up-super-mario-bros-in-openais-gym-8e39a96c1e41

  5. OpenAI Gym http://gym.openai.com/

  6. Richard Bellman, Dynamic Programming, (Dover Publications, Inc., New York, 2003).

  7. David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy Lillicrap, Karen Simonyan, Demis Hassabis, Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm, arXiv e-prints, Dec 2017.

  8. G. A. Rummery, Mahesan Niranjan, On-Line Q-Learning using Connectionist Systems, volume 37. University of Cambridge, Department of Engineering.

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Joshi, A.V. (2020). Dynamic Programming and Reinforcement Learning. In: Machine Learning and Artificial Intelligence. Springer, Cham. https://doi.org/10.1007/978-3-030-26622-6_9

  • DOI: https://doi.org/10.1007/978-3-030-26622-6_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-26621-9

  • Online ISBN: 978-3-030-26622-6

  • eBook Packages: Engineering, Engineering (R0)
