Are MOOC Learning Analytics Results Trustworthy? With Fake Learners, They Might Not Be!


The rich data that Massive Open Online Course (MOOC) platforms collect on the behavior of millions of users provide a unique opportunity to study human learning and to develop data-driven methods that can address the needs of individual learners. This type of research falls into the emerging field of learning analytics. However, learning analytics research tends to ignore the reliability of results that are based on MOOC data, which is typically noisy and generated by a largely anonymous crowd of learners. This paper provides evidence that learning analytics in MOOCs can be significantly biased by users who abuse the anonymity and open nature of MOOCs, for example by setting up multiple accounts; their sheer number and aberrant behavior can distort the findings. We identify these users, denoted fake learners, using dedicated algorithms. Our methodology for measuring the bias caused by fake learners’ activity combines the ideas of Replication Research and Sensitivity Analysis: we replicate two highly cited learning analytics studies with and without fake learners’ data, and compare the results. While the results of one study were relatively stable against fake learners, in the other, removing the fake learners’ data significantly changed the results. These findings raise concerns about the reliability of learning analytics in MOOCs, and highlight the need to develop more robust, generalizable, and verifiable research methods.
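The core of the methodology is a with/without comparison. The sketch below is a minimal, hypothetical illustration of that idea, not the paper's actual pipeline: the data, user names, and the `mean_score` statistic are invented for demonstration. The same analysis is run over all learners and over true learners only, and the difference is read as the bias attributable to fake learners.

```python
# Hypothetical sketch of the sensitivity-analysis idea: run the same
# learning-analytics computation with and without flagged fake-learner
# records and compare the resulting statistics. Toy data, invented names.

def mean_score(records, exclude_ids=frozenset()):
    """Average score over all records whose user is not excluded."""
    kept = [r["score"] for r in records if r["user"] not in exclude_ids]
    return sum(kept) / len(kept)

# Toy event log: two genuine learners and one "harvesting" fake account.
records = [
    {"user": "u1", "score": 0.62},
    {"user": "u2", "score": 0.55},
    {"user": "fake1", "score": 0.98},  # fake accounts often look like top performers
]
fake_ids = {"fake1"}

all_learners = mean_score(records)                          # biased estimate
true_learners = mean_score(records, exclude_ids=fake_ids)   # fake learners removed
bias = all_learners - true_learners
```

In this toy example the fake account inflates the average; whether such a shift matters for a given study's conclusions is exactly the question the replication-plus-sensitivity-analysis design is meant to answer.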



Change history

  • 13 January 2020

    In this issue, the citation information on the opening page of each article HTML was updated to read “International Journal of Artificial Intelligence in Education December 2019…,” not “International Journal of Artificial Intelligence in Education December 2000...”




  1. Alexandron, G., Ruipérez-Valiente, J.A., Pritchard, D.E. (2015a). Evidence of MOOC students using multiple accounts to harvest correct answers. Learning with MOOCs II, 2015.

  2. Alexandron, G., Zhou, Q., Pritchard, D. (2015b). Discovering the pedagogical resources that assist students in answering questions correctly – a machine learning approach. In Proceedings of the 8th international conference on educational data mining (pp. 520–523).

  3. Alexandron, G., Ruipérez-Valiente, J.A., Chen, Z., Muñoz-Merino, P.J., Pritchard, D.E. (2017). Copying@Scale: using harvesting accounts for collecting correct answers in a MOOC. Computers & Education, 108, 96–114.

  4. Alexandron, G., Ruipérez-Valiente, J.A., Lee, S., Pritchard, D.E. (2018). Evaluating the robustness of learning analytics results against fake learners. In Proceedings of the thirteenth European conference on technology enhanced learning: Springer.

  5. Alexandron, G., Ruipérez-Valiente, J.A., Pritchard, D.E. (2019). Towards a general purpose anomaly detection method to identify cheaters in massive open online courses. In Proceedings of the 12th international conference on educational data mining.

  6. Baker, R., Walonoski, J., Heffernan, N., Roll, I., Corbett, A., Koedinger, K. (2008). Why students engage in “gaming the system” behavior in interactive learning environments. Journal of Interactive Learning Research, 19(2), 162–182.


  7. Champaign, J., Colvin, K.F., Liu, A., Fredericks, C., Seaton, D., Pritchard, D.E. (2014). Correlating skill and improvement in 2 MOOCs with a student’s time on tasks. In Proceedings of the first ACM conference on Learning @ scale conference - L@S ’14 (pp. 11–20).

  8. Chen, Z., Chudzicki, C., Palumbo, D., Alexandron, G., Choi, Y.-J., Zhou, Q., Pritchard, D.E. (2016). Researching for better instructional methods using AB experiments in MOOCs: results and challenges. Research and Practice in Technology Enhanced Learning, 11(1), 9.


  9. De Ayala, R. (2009). The theory and practice of item response theory. Methodology in the social sciences. Guilford Publications.

  10. Donders, A.R.T., Van Der Heijden, G.J., Stijnen, T., Moons, K.G. (2006). A gentle introduction to imputation of missing values. Journal of Clinical Epidemiology, 59(10), 1087–1091.


  11. Du, X., Duivesteijn, W., Klabbers, M., Pechenizkiy, M. (2018). ELBA: exceptional learning behavior analysis. In Educational data mining (pp. 312–318).

  12. Gardner, J., Brooks, C., Andres, J.M.L., Baker, R. (2018). MORF: a framework for MOOC predictive modeling and replication at scale. arXiv:1801.05236.

  13. Goldhammer, F. (2015). Measuring ability, speed, or both? challenges, psychometric solutions, and what can be gained from experimental control. Measurement: Interdisciplinary Research and Perspectives, 13(3-4), 133–164.


  14. Hastie, T., Tibshirani, R., Friedman, J. (2001). The elements of statistical learning. Springer series in statistics. New York: Springer.


  15. Hodge, V., & Austin, J. (2004). A survey of outlier detection methodologies. Artificial Intelligence Review, 22(2), 85–126.


  16. Kiernan, M., Kraemer, H.C., Winkleby, M.A., King, A.C., Taylor, C.B. (2001). Do logistic regression and signal detection identify different subgroups at risk? implications for the design of tailored interventions. Psychological Methods, 6(1), 35.


  17. Kim, J., Guo, P.J., Cai, C.J., Li, S.-W.D., Gajos, K.Z., Miller, R.C. (2014a). Data-driven interaction techniques for improving navigation of educational videos. In Proceedings of the 27th annual ACM symposium on user interface software and technology - UIST’14 (pp. 563–572).

  18. Kim, J., Guo, P.J., Seaton, D.T., Mitros, P., Gajos, K.Z., Miller, R.C. (2014b). Understanding in-video dropouts and interaction peaks in online lecture videos. In Proceedings of the first ACM conference on Learning @ scale - L@S ’14.

  19. Koedinger, K.R., Mclaughlin, E.A., Kim, J., Jia, J.Z., Bier, N.L. (2015). Learning is not a spectator sport: doing is better than watching for learning from a MOOC. In Proceedings of the second (2015) ACM conference on learning @ scale - L@S ’15 (pp. 111–120).

  20. Krause, J., Perer, A., Ng, K. (2016). Interacting with predictions: visual inspection of black-box machine learning models. In Proceedings of the 2016 CHI conference on human factors in computing systems (pp. 5686–5697): ACM.

  21. Kyllonen, P., & Zu, J. (2016). Use of response time for measuring cognitive ability. Journal of Intelligence, 4(4), 14.


  22. Lazer, D., Kennedy, R., King, G., Vespignani, A. (2014). The parable of Google Flu: traps in big data analysis. Science, 343(6176), 1203–1205.

  23. Luna, J.M., Castro, C., Romero, C. (2017). MDM tool: a data mining framework integrated into Moodle. Computer Applications in Engineering Education, 25(1), 90–102.

  24. MacHardy, Z., & Pardos, Z.A. (2015). Toward the evaluation of educational videos using Bayesian knowledge tracing and big data. In Proceedings of the second (2015) ACM conference on learning @ scale, L@S ’15 (pp. 347–350): ACM.

  25. MacKinnon, J.G. (2009). Bootstrap hypothesis testing (chapter 6, pp. 183–213). John Wiley & Sons, Ltd.

  26. Meyer, J.P., & Zhu, S. (2013). Fair and equitable measurement of student learning in MOOCs: an introduction to item response theory, scale linking, and score equating. Research & Practice in Assessment, 8, 26–39.

  27. Müller, O., Junglas, I., Brocke, J.V., Debortoli, S. (2016). Utilizing big data analytics for information systems research: challenges, promises and guidelines. European Journal of Information Systems, 25(4), 289–302.


  28. Northcutt, C.G., Ho, A.D., Chuang, I.L. (2016). Detecting and preventing “multiple-account” cheating in massive open online courses. Computers & Education, 100(C), 71–80.

  29. O’Neil, C. (2017). Weapons of math destruction: how big data increases inequality and threatens democracy. Broadway Books.

  30. Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. Science, 349(6251). ISSN 0036-8075.

  31. Pardo, A., Mirriahi, N., Martinez-Maldonado, R., Jovanovic, J., Dawson, S., Gašević, D. (2016). Generating actionable predictive models of academic performance. In Proceedings of the sixth international conference on learning analytics & knowledge (pp. 474–478): ACM.

  32. Pardos, Z.A., Tang, S., Davis, D., Le, C.V. (2017). Enabling real-time adaptivity in MOOCs with a personalized next-step recommendation framework. In Proceedings of the fourth (2017) ACM conference on learning @ scale - L@S ’17. ISBN 9781450344500.

  33. Perez, S., Massey-Allard, J., Butler, D., Ives, J., Bonn, D., Yee, N., Roll, I. (2017). Identifying productive inquiry in virtual labs using sequence mining. In André, E., Baker, R., Hu, X., Rodrigo, M.M.T., du Boulay, B. (Eds.) Artificial intelligence in education (pp. 287–298). Cham: Springer International Publishing.


  34. Qiu, J., Tang, J., Liu, T.X., Gong, J., Zhang, C., Zhang, Q., Xue, Y. (2016). Modeling and predicting learning behavior in MOOCs. In Proceedings of the ninth ACM international conference on web search and data mining (pp. 93–102): ACM.

  35. Reich, J., & Ruipérez-Valiente, J.A. (2019). The MOOC pivot. Science, 363 (6423), 130–131.


  36. Romero, C., & Ventura, S. (2017). Educational data science in massive open online courses. WIREs Data Mining and Knowledge Discovery, 7(1).

  37. Rosen, Y., Rushkin, I., Ang, A., Federicks, C., Tingley, D., Blink, M.J. (2017). Designing adaptive assessments in MOOCs. In Proceedings of the fourth (2017) ACM conference on learning @ scale, L@S ’17. ISBN 978-1-4503-4450-0 (pp. 233–236).

  38. Ruipérez-Valiente, J.A., Alexandron, G., Chen, Z., Pritchard, D.E. (2016). Using multiple accounts for harvesting solutions in MOOCs. In Proceedings of the third (2016) ACM conference on learning @ scale - L@S ’16 (pp. 63–70).

  39. Ruipérez-Valiente, J.A., Joksimović, S., Kovanović, V., Gašević, D., Muñoz Merino, P.J., Delgado Kloos, C. (2017a). A data-driven method for the detection of close submitters in online learning environments. In Proceedings of the 26th international conference on world wide web companion (pp. 361–368).

  40. Ruipérez-Valiente, J.A., Muñoz-Merino, P.J., Alexandron, G., Pritchard, D.E. (2017b). Using machine learning to detect ‘multiple-account’ cheating and analyze the influence of student and problem features. IEEE Transactions on Learning Technologies, 14(8), 1–11.


  41. Ruipérez-Valiente, J.A., Muñoz-Merino, P.J., Gascón-Pinedo, J.A., Kloos, C.D. (2017c). Scaling to massiveness with ANALYSE: a learning analytics tool for Open edX. IEEE Transactions on Human-Machine Systems, 47(6), 909–914.


  42. Saltelli, A., Chan, K., Scott, E.M., et al. (2000). Sensitivity analysis Vol. 1. New York: Wiley.


  43. Seshia, S.A., & Sadigh, D. (2016). Towards verified artificial intelligence. CoRR, arXiv:1606.08514.

  44. Siemens, G. (2013). Learning analytics: the emergence of a discipline. American Behavioral Scientist, 57(10), 1380–1400.


  45. Silver, N. (2012). The signal and the noise: why so many predictions fail–but some don’t. Penguin.

  46. U.S. Department of Education, Office of Educational Technology. (2012). Enhancing teaching and learning through educational data mining and learning analytics: an issue brief.

  47. van der Zee, T., & Reich, J. (2018). Open education science. AERA Open, 4 (3), 2332858418787466.


  48. Xing, W., Chen, X., Stein, J., Marcinkowski, M. (2016). Temporal predication of dropouts in MOOCs: reaching the low hanging fruit through stacking generalization. Computers in Human Behavior, 58, 119–129.

  49. Yudelson, M., Fancsali, S., Ritter, S., Berman, S., Nixon, T., Joshi, A. (2014). Better data beats big data. In Educational data mining 2014.



GA’s research is supported by the Israeli Ministry of Science and Technology under project no. 713257.

Author information



Corresponding author

Correspondence to Giora Alexandron.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Appendix 1: Sampling Distribution of Mean IRT Ability

Fig. 7

Sampling Distribution of Mean IRT Ability for a Doers who are neither Watchers nor Readers b Doers who are also Watchers. The dashed vertical lines mark the 95% confidence interval, and the vertical blue lines mark the mean value without fake learners
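The confidence interval marked in this figure comes from a sampling distribution of a mean; a standard way to obtain such a distribution is bootstrap resampling. The snippet below is an illustrative sketch under that assumption only (the function name, parameters, and data are invented, not taken from the paper): resample the ability estimates with replacement, record the mean of each resample, and read the 2.5th/97.5th percentiles off as a 95% interval.

```python
# Illustrative bootstrap of the sampling distribution of a mean:
# resample values with replacement, compute the mean of each resample,
# and take empirical percentiles as the confidence-interval endpoints.
import random

def bootstrap_mean_ci(values, n_boot=10000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean of `values` (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    means = sorted(
        sum(rng.choices(values, k=len(values))) / len(values)
        for _ in range(n_boot)
    )
    lo = means[int((alpha / 2) * n_boot)]          # 2.5th percentile
    hi = means[int((1 - alpha / 2) * n_boot) - 1]  # 97.5th percentile
    return lo, hi

# Usage with made-up ability estimates:
abilities = [0.1 * i for i in range(20)]
lo, hi = bootstrap_mean_ci(abilities)
```

The interval's endpoints play the role of the dashed vertical lines in the figure; comparing them against the statistic computed without fake learners is the stability check the appendix illustrates.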

Appendix 2: Replication Study 1 with 2X Simulated Fake Learners

Fig. 8

IRT Results a With simulated 2x fake learners b Original data (all learners); c true learners (fake learners removed)

Fig. 9

Tetrad Results a With simulated 2x fake learners b Original data (all learners); c true learners (fake learners removed)

Appendix 3: Figures from Original Papers

Fig. 10

Original figures from Koedinger et al. (2015) a Final grade by learner type b Causal model generated by Tetrad

Fig. 11

Original figure from Champaign et al. (2014)


About this article


Cite this article

Alexandron, G., Yoo, L.Y., Ruipérez-Valiente, J.A. et al. Are MOOC Learning Analytics Results Trustworthy? With Fake Learners, They Might Not Be!. Int J Artif Intell Educ 29, 484–506 (2019).



Keywords

  • Learning Analytics
  • MOOCs
  • Replication research
  • Sensitivity analysis
  • Fake learners