Abstract
Advances in reinforcement learning research have demonstrated how different agent-based models can learn to perform a task optimally within a given environment. Reinforcement learning solves unsupervised problems in which agents move through a state-action-reward loop to maximize their overall reward, which in turn optimizes the solution of a specific problem in a given environment. However, these algorithms are designed based on our understanding of the actions that should be taken in a real-world environment to solve a specific problem. One such problem is the ability to identify, recommend and execute an action within a system where the users are the subject, such as in education. In recent years, the use of blended learning approaches, which integrate face-to-face learning with online learning, has increased in the education context. Additionally, online platforms used for education require the automation of certain functions, such as the identification, recommendation or execution of actions that can benefit the user, in this case the student or learner. As promising as these scientific advances are, research is still needed in a variety of areas to ensure the successful deployment of these agents within education systems. Therefore, the aim of this study was to contextualise and simulate the cumulative reward within an environment for an intervention recommendation problem in the education context.
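To make the idea of simulating cumulative reward concrete, the following is a minimal sketch of a multi-armed bandit in which each arm stands for a candidate student intervention. It is not the authors' implementation: the epsilon-greedy selection rule, the intervention names, the success probabilities and the function name `simulate_bandit` are all assumptions chosen purely for illustration.

```python
import random


def simulate_bandit(success_probs, steps=1000, epsilon=0.1, seed=0):
    """Run an epsilon-greedy agent on a Bernoulli multi-armed bandit.

    success_probs: assumed probability that each intervention (arm) helps.
    Returns (cumulative_rewards, estimates, counts).
    """
    rng = random.Random(seed)          # local generator keeps runs reproducible
    n_arms = len(success_probs)
    counts = [0] * n_arms              # how often each arm was chosen
    estimates = [0.0] * n_arms         # running mean reward per arm
    cumulative = []                    # cumulative reward after each step
    total = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an intervention uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the intervention with the best estimate so far.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Environment step: reward 1 if the intervention "worked", else 0.
        reward = 1.0 if rng.random() < success_probs[arm] else 0.0

        # Incremental update of the sample mean for the chosen arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

        total += reward
        cumulative.append(total)

    return cumulative, estimates, counts


# Hypothetical interventions and success probabilities (illustrative only).
interventions = ["tutoring", "reminder_email", "advisor_meeting"]
success_probs = [0.3, 0.5, 0.7]

cumulative, estimates, counts = simulate_bandit(success_probs)
print("final cumulative reward:", cumulative[-1])
for name, est, n in zip(interventions, estimates, counts):
    print(f"{name}: chosen {n} times, estimated success rate {est:.2f}")
```

Plotting `cumulative` against the step index gives the cumulative reward curve for one simulated run. Other selection rules, such as upper confidence bound or Thompson sampling, could be swapped in at the arm-selection step without changing the rest of the loop.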
Notes
- 1. Source: https://www.sciencedirect.com/science/article/pii/S0007850618301471.
Copyright information
© 2023 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Combrink, H.M., Marivate, V., Rosman, B. (2023). Reinforcement Learning in Education: A Multi-armed Bandit Approach. In: Masinde, M., Bagula, A. (eds) Emerging Technologies for Developing Countries. AFRICATEK 2022. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 503. Springer, Cham. https://doi.org/10.1007/978-3-031-35883-8_1
DOI: https://doi.org/10.1007/978-3-031-35883-8_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-35882-1
Online ISBN: 978-3-031-35883-8
eBook Packages: Computer Science, Computer Science (R0)