Both educational data mining and learning analytics aim to understand learners and optimise learning processes in educational settings such as Moodle, a learning management system (LMS). Analytics in an LMS covers many different aspects, for example finding students at risk of abandoning a course or identifying students with difficulties before the assessments. Thus, there are multiple prediction models that can be explored. Prediction models can also target the course itself. For instance, will this activity assessment engage learners? To ease the evaluation and usage of prediction models in Moodle, we abstract out the most relevant elements of prediction models and develop an analytics framework for Moodle. Apart from the software framework, we also present a case study model that uses assessment-based variables to predict students at risk of dropping out of a massive open online course that was offered eight times from 2013 to 2018 to a total of 46,895 students. A neural network is trained with data from past courses, and the framework generates insights about students at risk in ongoing courses. Predictions are generated after the first, second, and third quarters of the course. The average accuracy that we achieve is 88.81%, with an F1 score of 0.9337 and an area under the ROC curve of 73.12%.
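The three metrics reported above (accuracy, F1 score, and area under the ROC curve) can be illustrated with a minimal sketch. It does not reproduce the framework's code or the study's results. The labels and scores are made-up toy data, the 0.5 decision threshold is an assumption, and treating "at risk of dropping out" as the positive class (1) is also an assumption.

```python
# Toy illustration of accuracy, F1 and ROC AUC for a binary "at risk" classifier.
# All data below is invented for illustration; it is not from the study.

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn), with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)

def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, written in terms of counts."""
    return 2 * tp / (2 * tp + fp + fn)

def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = at risk, an assumed convention) and model scores.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.7, 0.2, 0.55]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
acc = accuracy(tp, tn, fp, fn)
f1 = f1_score(tp, fp, fn)
auc = roc_auc(y_true, scores)
print(f"accuracy={acc:.4f} f1={f1:.4f} auc={auc:.4f}")
```

Accuracy and F1 depend on the chosen threshold, whereas the area under the ROC curve is computed from the raw scores. This is one reason the three figures can differ considerably on imbalanced data.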
This research project was funded by Moodle Pty Ltd, and by the Australian government and The University of Western Australia through the Research Training Program (RTP). We thank Moodle HQ for providing the dataset used in this study. Special thanks to Helen Foster and Mary Cooch for setting up the MOOC and for running regular versions of the course. Thanks also to all Moodle HQ staff and members of the Moodle community who participated in the project by doing code reviews, testing the framework, and helping to design the user interface of the tool.
Cite this article
Monllaó Olivé, D., Huynh, D.Q., Reynolds, M. et al. A supervised learning framework: using assessment to identify students at risk of dropping out of a MOOC. J Comput High Educ 32, 9–26 (2020). https://doi.org/10.1007/s12528-019-09230-1
- Learning management systems
- Learning analytics
- Educational data mining
- Machine learning
- Neural networks