Exploring N-gram Features in Clickstream Data for MOOC Learning Achievement Prediction

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10179)

Abstract

Massive open online courses (MOOCs) have emerged in recent years as a new model of online education. With the development of big data technology, MOOC platforms can now mine a huge amount of learning behavior data. Using machine learning to predict learners’ future achievement from their past clickstream data has recently become a hot research topic. Previous methods consider only static, counting-based features and ignore the correlative, temporal, and fragmented nature of MOOC learning behavior, which limits both their interpretability and their prediction accuracy. In this paper, we explore the effectiveness of N-gram features in clickstream data and model MOOC learning achievement prediction as a multiclass classification task that assigns learners to one of four achievement levels. Through extensive experiments on four real-world MOOC datasets, we empirically demonstrate that our methods significantly outperform state-of-the-art methods.
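
To make the idea of N-gram features in clickstream data concrete, the following is a minimal sketch of counting contiguous N-grams in one learner's event sequence and turning them into a fixed-length feature vector. The event names (`video`, `pause`, `quiz`), the helper names, and the choice of N = 2 are assumptions made for illustration only; the abstract does not specify the event vocabulary or the exact feature construction used in the paper.

```python
from collections import Counter
from typing import List, Sequence, Tuple

NGram = Tuple[str, ...]


def ngram_counts(events: Sequence[str], n: int) -> Counter:
    """Count contiguous n-grams (as tuples) in one learner's event sequence."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Slide a window of length n over the sequence; short sequences yield no n-grams.
    return Counter(tuple(events[i:i + n]) for i in range(len(events) - n + 1))


def feature_vector(events: Sequence[str], vocabulary: Sequence[NGram], n: int) -> List[int]:
    """Map a sequence to counts over a fixed n-gram vocabulary (hypothetical helper)."""
    counts = ngram_counts(events, n)
    return [counts[gram] for gram in vocabulary]


# Hypothetical clickstream for a single learner (event names are assumptions).
clicks = ["video", "pause", "video", "quiz", "video", "pause"]
bigrams = ngram_counts(clicks, 2)

# A fixed, sorted vocabulary keeps feature positions consistent across learners.
vocabulary = sorted(bigrams)
vector = feature_vector(clicks, vocabulary, 2)
```

Unlike a plain count of each event type, the bigram `("video", "pause")` records the order in which two actions occurred, which is one way to capture the temporal and correlative structure the abstract refers to. Vectors of this kind could then be passed to any multiclass classifier with four target labels.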

Keywords

MOOC · Educational data mining · Learning analytics · Machine learning

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Information Center, National University of Defense Technology, Changsha, China
  2. College of Computer, National University of Defense Technology, Changsha, China
  3. National Key Laboratory of Parallel and Distributed Processing, National University of Defense Technology, Changsha, China
