Abstract
Reinforcement Learning (RL) is a branch of machine learning (ML) that is used to train artificial intelligence (AI) systems and find the optimal solution for problems. This tutorial paper aims to present an introductory overview of the RL. Furthermore, we discuss the most popular algorithms used in RL and the Markov decision process (MDP) usage in the RL environment. Moreover, RL applications and achievements that shine in the world of AI are covered.
The Khalifa University, UAE partially support this research.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
Abbeel, P., Coates, A., Quigley, M., Ng, A.Y.: An application of reinforcement learning to aerobatic helicopter flight. In: Advances in Neural Information Processing Systems, pp. 1–8 (2007)
Achiam, J.: Introduction to RL (2018). BOpen AI. https://spinningup.openai.com/en/latest/spinningup/rl_intro.html
Arabnejad, H., Pahl, C., Jamshidi, P., Estrada, G.: A comparison of reinforcement learning techniques for fuzzy cloud auto-scaling. In: Proceedings of the 2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, pp. 64–73 (2017)
Arjona-Medina, J.A., Gillhofer, M., Widrich, M., Unterthiner, T., Brandstetter, J., Hochreiter, S.: RUDDER: return decomposition for delayed rewards. arXiv preprint arXiv:1806.07857 (2018)
Bellman, R.: On the theory of dynamic programming. Proc. Natl. Acad. Sci. U.S.A. 38(8), 716 (1952)
Bellman, R.E., Dreyfus, S.E.: Applied Dynamic Programming. Princeton University Press, Princeton (2015)
Bu, X., Rao, J., Xu, C.Z.: A reinforcement learning approach to online web systems auto-configuration. In: Proceedings of the 2009 29th IEEE International Conference on Distributed Computing Systems, pp. 2–11 (2009)
Caruana, R.: Multitask learning. Mach. Learn. 28(1), 41–75 (1997)
Chan, S.C., Fishman, S., Canny, J., Korattikara, A., Guadarrama, S.: Measuring the reliability of reinforcement learning algorithms. arXiv preprint arXiv:1912.05663 (2019)
De Luca, G.: What is a policy in reinforcement learning? (2020). Baeldung. https://www.baeldung.com/cs/ml-policy-reinforcement-learning
Dimov, I.T., Tonev, O.I.: Monte Carlo algorithms: performance analysis for some computer architectures. J. Comput. Appl. Math. 48(3), 253–277 (1993)
Dulac-Arnold, G., Mankowitz, D., Hester, T.: Challenges of real-world reinforcement learning. arXiv preprint arXiv:1904.12901 (2019)
Fazly, R.: Data science book (2020). GitHub. https://github.com/FazlyRabbiBD/Data-Science-Book/blob/master/8-ReinforcementLearning.ipynb
Guru99: Reinforcement learning: what is, algorithms, applications, example (2020). Guru99. https://www.guru99.com/reinforcement-learning-tutorial.html
Hester, T., Stone, P.: TEXPLORE: real-time sample-efficient reinforcement learning for robots. Mach. Learn. 90(3), 385–429 (2013)
Hui, J.: RL – value learning (2018). Medium. https://jonathan-hui.medium.com/rl-value-learning-24f52b49c36d
Jiang, J., Dun, C., Huang, T., Lu, Z.: Graph convolutional reinforcement learning. arXiv preprint arXiv:1810.09202 (2018)
Jin, J., Song, C., Li, H., Gai, K., Wang, J., Zhang, W.: Real-time bidding with multi-agent reinforcement learning in display advertising. In: Proceedings of the 27th ACM International Conference on Information and Knowledge Management, pp. 2193–2201 (2018)
Kaelbling, L.P., Littman, M.L., Moore, A.W.: Reinforcement learning: a survey. J. Artif. Intell. Res. 4, 237–285 (1996)
Konda, V.R., Tsitsiklis, J.N.: On actor-critic algorithms. SIAM J. Control Optim. 42(4), 1143–1166 (2003)
Lee, K., Lee, K., Shin, J., Lee, H.: Network randomization: a simple technique for generalization in deep reinforcement learning. arXiv preprint arXiv:1910.05396 (2019)
Manju, S., Punithavalli, M.: An analysis of Q-learning algorithms with strategies of reward function. Int. J. Comput. Sci. Eng. 3(2), 814–820 (2011)
Mann, T.A., et al.: Learning from delayed outcomes via proxies with applications to recommender systems. In: International Conference on Machine Learning, pp. 4324–4332. PMLR (2019)
Mao, H., Alizadeh, M., Menache, I., Kandula, S.: Resource management with deep reinforcement learning. In: Proceedings of the 15th ACM Workshop on Hot Topics in Networks, pp. 50–56 (2016)
Mnih, V., et al.: Playing Atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602 (2013)
Moazami, S., Doerschuk, P.: Modeling survival in model-based reinforcement learning. arXiv preprint arXiv:2004.08648 (2020)
Mondal, A.K., Jamali, N.: A survey of reinforcement learning techniques: strategies, recent development, and future directions. arXiv preprint arXiv:2001.06921 (2020)
Osband, I., et al.: Behaviour suite for reinforcement learning. arXiv preprint arXiv:1908.03568 (2019)
Van der Pol, E., Oliehoek, F.A.: Coordinated deep reinforcement learners for traffic light control. In: Proceedings of the NIPS 2016 Workshop on Learning, Inference and Control of Multi-Agent Systems, pp. 1–8 (2016)
Rummery, G.A., Niranjan, M.: On-line Q-learning using connectionist systems. Technical report, Cambridge University Engineering Department, UK (1994)
Samuel, A.L.: Some studies in machine learning using the game of checkers. IBM J. Res. Dev. 3(3), 210–229 (1959)
SAS: Machine learning: what it is and why it matters (2020), SAS. https://www.sas.com/en_us/insights/analytics/machine-learning.html
Sharma, A.R., Kaushik, P.: Literature survey of statistical, deep and reinforcement learning in natural language processing. In: Proceedings of the 2017 IEEE International Conference on Computing, Communication and Automation, pp. 350–354 (2017)
Silver, D., et al.: Mastering the game of Go with deep neural networks and tree search. Nature 529(7587), 484–489 (2016)
Silver, D., et al.: Mastering chess and shogi by self-play with a general reinforcement learning algorithm. arXiv preprint arXiv:1712.01815 (2017)
Singh, A.: Reinforcement learning: Bellman equation and optimality (Part 2). Towards Data Sci. (2019). https://towardsdatascience.com/reinforcement-learning-markov-decision-process-part-2-96837c936ec3
Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press, Cambridge (2018)
Taylor, G., Burmeister, R., Xu, Z., Singh, B., Patel, A., Goldstein, T.: Training neural networks without gradients: a scalable ADMM approach. In: Proceedings of the 33rd International Conference on Machine Learning, pp. 2722–2731 (2016)
Tesauro, G.: TD-Gammon, a self-teaching backgammon program, achieves master-level play. Neural Comput. 6(2), 215–219 (1994)
Woergoetter, F., Porr, B.: Reinforcement learning. ScholarPedia 3, 1448 (2008)
Zhang, J.: Reinforcement learning - model based planning methods. Towards Data Science (2020). https://towardsdatascience.com/reinforcement-learning-model-based-planning-methods-5e99cae0abb8
Zheng, G., et al.: DRN: a deep reinforcement learning framework for news recommendation. In: Proceedings of the 2018 World Wide Web Conference, pp. 167–176 (2018)
Zhou, Z., Li, X., Zare, R.N.: Optimizing chemical reactions with deep reinforcement learning. ACS Cent. Sci. 3(12), 1337–1344 (2017)
Zhu, H.: The ingredients of real-world robotic reinforcement learning. arXiv preprint arXiv:2004.12570 (2020)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Daoun, D., Ibnat, F., Alom, Z., Aung, Z., Azim, M.A. (2022). Reinforcement Learning: A Friendly Introduction. In: Awan, I., Benbernou, S., Younas, M., Aleksy, M. (eds) The International Conference on Deep Learning, Big Data and Blockchain (Deep-BDB 2021). Deep-BDB 2021. Lecture Notes in Networks and Systems, vol 309. Springer, Cham. https://doi.org/10.1007/978-3-030-84337-3_11
Download citation
DOI: https://doi.org/10.1007/978-3-030-84337-3_11
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-84336-6
Online ISBN: 978-3-030-84337-3
eBook Packages: Intelligent Technologies and RoboticsIntelligent Technologies and Robotics (R0)