
Reinforcement Learning: A Friendly Introduction

  • Conference paper
The International Conference on Deep Learning, Big Data and Blockchain (Deep-BDB 2021)

Abstract

Reinforcement Learning (RL) is a branch of machine learning (ML) that is used to train artificial intelligence (AI) systems and find optimal solutions to problems. This tutorial paper presents an introductory overview of RL. We discuss the most popular RL algorithms and the use of the Markov decision process (MDP) in the RL environment. We also cover RL applications and achievements that stand out in the world of AI.

This research was partially supported by Khalifa University, UAE.



Author information

Corresponding author

Correspondence to Mohammad Abdul Azim.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Daoun, D., Ibnat, F., Alom, Z., Aung, Z., Azim, M.A. (2022). Reinforcement Learning: A Friendly Introduction. In: Awan, I., Benbernou, S., Younas, M., Aleksy, M. (eds) The International Conference on Deep Learning, Big Data and Blockchain (Deep-BDB 2021). Deep-BDB 2021. Lecture Notes in Networks and Systems, vol 309. Springer, Cham. https://doi.org/10.1007/978-3-030-84337-3_11
