The rich data that Massive Open Online Course (MOOC) platforms collect on the behavior of millions of users provide a unique opportunity to study human learning and to develop data-driven methods that can address the needs of individual learners. This type of research falls into the emerging field of learning analytics. However, learning analytics research tends to ignore the reliability of results based on MOOC data, which is typically noisy and generated by a largely anonymous crowd of learners. This paper provides evidence that learning analytics in MOOCs can be significantly biased by users who abuse the anonymity and open nature of MOOCs, for example by setting up multiple accounts; both their number and their aberrant behavior can skew results. We identify these users, denoted fake learners, using dedicated algorithms. The methodology for measuring the bias caused by fake learners’ activity combines ideas from Replication Research and Sensitivity Analysis: we replicate two highly cited learning analytics studies with and without fake learners’ data and compare the results. While in one study the results were relatively stable against fake learners, in the other, removing the fake learners’ data significantly changed the results. These findings raise concerns about the reliability of learning analytics in MOOCs, and highlight the need to develop more robust, generalizable, and verifiable research methods.
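The sensitivity-analysis idea described above can be illustrated with a minimal sketch: recompute a learning-analytics statistic on the full dataset and again after excluding learners flagged as fake, then inspect the difference. The records, the `fake` flag, and the `mean_score` helper below are all hypothetical illustrations, not the paper's actual data or detection algorithms.

```python
# Minimal sketch of replication-with-exclusion sensitivity analysis.
# All data and names are invented for illustration.

def mean_score(records, exclude_fake=False):
    """Mean assessment score, optionally dropping flagged learners."""
    kept = [r for r in records if not (exclude_fake and r["fake"])]
    return sum(r["score"] for r in kept) / len(kept)

# Toy records: fake learners (e.g. answer-harvesting accounts)
# tend to show aberrant, near-perfect performance.
records = [
    {"user": "u1", "score": 0.62, "fake": False},
    {"user": "u2", "score": 0.55, "fake": False},
    {"user": "u3", "score": 0.99, "fake": True},
    {"user": "u4", "score": 0.70, "fake": False},
]

with_fake = mean_score(records)                      # statistic on all data
without_fake = mean_score(records, exclude_fake=True)  # replication without fakes
bias = with_fake - without_fake                      # estimated distortion
```

A large `bias` relative to the statistic's sampling variability would suggest, as in the paper's second replicated study, that the original result is not robust to fake-learner activity.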
GA’s research is supported by the Israeli Ministry of Science and Technology under project no. 713257.
Appendix 1: Sampling Distribution of Mean IRT Ability
Appendix 2: Replication Study 1 with 2X Simulated Fake Learners
Appendix 3: Figures from Original Papers
Cite this article
Alexandron, G., Yoo, L.Y., Ruipérez-Valiente, J.A. et al. Are MOOC Learning Analytics Results Trustworthy? With Fake Learners, They Might Not Be!. Int J Artif Intell Educ 29, 484–506 (2019). https://doi.org/10.1007/s40593-019-00183-1
- Learning analytics
- Replication research
- Sensitivity analysis
- Fake learners