CNN autoencoders and LSTM-based reduced order model for student dropout prediction

  • Original Article
  • Neural Computing and Applications

Abstract

In recent years, Massive Open Online Courses (MOOCs) have become the main online learning method for students all over the world, but their development has long been hampered by high dropout rates. Dropout prediction is therefore a vital task for early teaching intervention and user retention. MOOC platforms store students’ learning records, which contain high-dimensional time series features. However, these features are hard to process, and the nonlinear relationships among them are difficult to learn. These limitations have become obstacles to improving dropout prediction performance. In this paper, we propose a new dimension-reduced dropout prediction model based on neural networks to address these limitations. The proposed model, called CNNAE-LSTM, combines a convolutional neural network autoencoder (CNNAE) with a long short-term memory (LSTM) neural network. Specifically, CNNAE-LSTM uses the CNNAE to compress the students’ learning features into a low-dimensional latent space, then projects the latent space for reconstruction so that the representative features of the learning records are retained, and finally minimizes the reconstruction error to capture the nonlinear relationship between the features and dropout. The LSTM network then models the time evolution of the latent vector. Our experiments on the KDD CUP 2015 dataset and the real-world dataset XuetangX demonstrate that the proposed model achieves better predictive performance than the state-of-the-art baseline methods.
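
The pipeline outlined in the abstract can be illustrated with a minimal forward-pass sketch. It is not the authors' implementation: the layer sizes, the single shared convolution kernel, the choice to convolve along the feature axis of each time step, the omission of biases and training, and every name below (`ConvAutoencoder`, `lstm_last_hidden`, `record`, and so on) are assumptions made for illustration, and the weights are random. The sketch only shows how data would flow: each time step is encoded into a latent vector, the decoder yields a reconstruction error, and an LSTM reads the latent sequence to produce a dropout probability.

```python
import math
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng, scale=0.5):
    """Random weights; a trained model would learn these instead."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def conv1d(x, kernel):
    """Valid 1-D cross-correlation of a feature vector with one kernel."""
    k = len(kernel)
    return [sum(kernel[j] * x[i + j] for j in range(k)) for i in range(len(x) - k + 1)]


class ConvAutoencoder:
    """Hypothetical one-layer convolutional encoder with a linear decoder."""

    def __init__(self, n_features, latent_dim, kernel_size, rng):
        self.kernel = [rng.uniform(-0.5, 0.5) for _ in range(kernel_size)]
        self.W_enc = rand_matrix(latent_dim, n_features - kernel_size + 1, rng)
        self.W_dec = rand_matrix(n_features, latent_dim, rng)

    def encode(self, x):
        hidden = [math.tanh(v) for v in conv1d(x, self.kernel)]
        return [math.tanh(v) for v in matvec(self.W_enc, hidden)]

    def decode(self, z):
        return matvec(self.W_dec, z)


def lstm_last_hidden(latents, gates, hidden_dim):
    """Run a bias-free LSTM cell over the latent sequence; return the final hidden state."""
    h = [0.0] * hidden_dim
    c = [0.0] * hidden_dim
    for z in latents:
        xh = z + h  # concatenate the current latent vector with the previous hidden state
        i = [sigmoid(v) for v in matvec(gates["input"], xh)]
        f = [sigmoid(v) for v in matvec(gates["forget"], xh)]
        o = [sigmoid(v) for v in matvec(gates["output"], xh)]
        g = [math.tanh(v) for v in matvec(gates["cell"], xh)]
        c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h


# Illustrative sizes only; they are not taken from the paper or its datasets.
N_STEPS, N_FEATURES, LATENT_DIM, HIDDEN_DIM = 5, 8, 3, 4
rng = random.Random(0)

# One synthetic student record: N_STEPS time steps, N_FEATURES activity features each.
record = [[rng.random() for _ in range(N_FEATURES)] for _ in range(N_STEPS)]

ae = ConvAutoencoder(N_FEATURES, LATENT_DIM, kernel_size=3, rng=rng)
gates = {name: rand_matrix(HIDDEN_DIM, LATENT_DIM + HIDDEN_DIM, rng)
         for name in ("input", "forget", "output", "cell")}
w_out = [rng.uniform(-0.5, 0.5) for _ in range(HIDDEN_DIM)]

# Step 1: compress every time step into the latent space.
latents = [ae.encode(x) for x in record]

# Step 2: mean squared reconstruction error, the quantity training would minimize.
recon_error = sum(
    (a - b) ** 2
    for x, z in zip(record, latents)
    for a, b in zip(x, ae.decode(z))
) / (N_STEPS * N_FEATURES)

# Step 3: the LSTM follows the latent vectors over time; a sigmoid gives a probability.
final_hidden = lstm_last_hidden(latents, gates, HIDDEN_DIM)
prob = sigmoid(sum(w * v for w, v in zip(w_out, final_hidden)))

print(f"reconstruction error: {recon_error:.4f}, dropout probability: {prob:.4f}")
```

Because the weights are untrained, the printed numbers carry no meaning; in practice the autoencoder and the LSTM would be trained on labeled learning records, most likely with a deep learning framework rather than plain Python lists.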

Data availability

The data used in this research is available on reasonable request.

Notes

  1. http://moocdata.cn/data/user-activity.

Acknowledgements

This work was supported by the Beijing Educational Science Planning Project under grant no. CHCA2020102.

Author information

Corresponding author

Correspondence to Ke Niu.

Ethics declarations

Conflict of interest

The authors declare no potential conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Niu, K., Lu, G., Peng, X. et al. CNN autoencoders and LSTM-based reduced order model for student dropout prediction. Neural Comput & Applic 35, 22341–22357 (2023). https://doi.org/10.1007/s00521-023-08894-2

  • DOI: https://doi.org/10.1007/s00521-023-08894-2
