Learning behavior feature fused deep learning network model for MOOC dropout prediction

Education and Information Technologies

Abstract

Massive open online courses (MOOCs) have become one of the most popular ways of learning in recent years because of their flexibility and convenience. However, their high dropout rate has become a prominent problem that hinders their further development, so predicting student dropout is key to improving MOOC platforms. Traditional dropout prediction models based on machine learning struggle to deliver reliable predictions because they mine feature information insufficiently and do not consider the influence of time series. To address this problem, we propose the learning behavior feature fused deep learning network model (LBDL) for MOOC dropout prediction. The core of the model lies in modeling different types of information separately and incorporating them into an overall framework. In the data processing stage, the LBDL model divides the data features into video learning behavior features, which contain time series information, and general information features. For the video learning behavior features, the model uses Bi-LSTM and attention mechanisms to mine time series information; for the general information features, it uses an embedding layer and a fully connected layer. Together, these two modeling approaches yield a hidden vector containing both types of feature information. This hidden vector is then combined with the original feature information to train the gradient boosting framework LightGBM. Experiments on the MOOCCube video dataset show that the AUC and F1-score of our model reach 82.39% and 74.89%, respectively, which are higher than those of the baseline models. This indicates that the proposed LBDL model performs better on the dropout prediction problem.
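The fusion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it omits the Bi-LSTM, the embedding and fully connected layers, and LightGBM training, and only shows how a sequence of hidden states might be pooled by attention and then concatenated with the other features. The dot-product scoring, the `query` vector, the function names, and all toy values are assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Collapse T per-step hidden states into one vector.

    hidden_states: list of T equal-length vectors (stand-ins for the
        per-time-step outputs a Bi-LSTM would produce).
    query: vector of the same width (hypothetical learned parameter;
        dot-product scoring is an assumption, not the paper's formula).
    """
    scores = [sum(h * q for h, q in zip(state, query)) for state in hidden_states]
    weights = softmax(scores)
    width = len(hidden_states[0])
    pooled = [
        sum(w * state[d] for w, state in zip(weights, hidden_states))
        for d in range(width)
    ]
    return pooled, weights


def fuse(sequence_vector, general_vector, original_features):
    """Concatenate both learned representations with the raw features."""
    return list(sequence_vector) + list(general_vector) + list(original_features)


# Toy inputs (made up): three time steps of 2-dimensional hidden states.
hidden_states = [[0.1, 0.0], [0.4, 0.2], [0.9, 0.7]]
query = [1.0, 1.0]
pooled, weights = attention_pool(hidden_states, query)

general_vector = [0.3, -0.1]          # stand-in for the embedding + dense output
original_features = [12.0, 3.0, 1.0]  # stand-in for the raw input features

# This fused row is what a gradient boosting model would be trained on.
fused = fuse(pooled, general_vector, original_features)
```

In this sketch the attention weights sum to one, so the pooled vector is a weighted average of the time steps; concatenation keeps the raw features available to the downstream tree model alongside the learned representations.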

Data availability

The datasets used in the current study are available from the authors upon reasonable request.

Acknowledgements

The authors would like to thank all the participants who devoted their time and energy to taking part in the experiments. We thank the associate editor and the reviewers for their useful feedback, which improved this paper.

Funding

This work is supported by the National Natural Science Foundation of China (Grant Nos. 62071379, 62071378, 61901365 and 62106196), the Natural Science Basic Research Plan in Shaanxi Province of China (Grant Nos. 2021JM-461 and 2020JM-299), the Fundamental Research Funds for the Central Universities (Grant No. GK202103085), and New Star Team of Xi'an University of Posts & Telecommunications (Grant No. xyt2016-01).

Author information

Contributions

Xiao Chen designed the model and was responsible for data processing, the experiments, and writing the original draft. Hanqiang Liu contributed suggestions and ideas on the model, participated in revising the manuscript, and approved the final manuscript. Feng Zhao provided suggestions on the experimental part of the model and participated in revising the manuscript.

Corresponding author

Correspondence to Xiao Chen.

Ethics declarations

Ethics approval

The participants were protected by hiding their personal information during the research process. They knew that their participation was voluntary and they could withdraw from the study at any time.

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Liu, H., Chen, X. & Zhao, F. Learning behavior feature fused deep learning network model for MOOC dropout prediction. Educ Inf Technol 29, 3257–3278 (2024). https://doi.org/10.1007/s10639-023-11960-w

  • DOI: https://doi.org/10.1007/s10639-023-11960-w
