Educational Data Mining and Learning Analytics

  • Ryan Shaun Baker
  • Paul Salvador Inventado

Abstract

In recent years, two communities have grown around a shared interest in how big data can be exploited to benefit education and the science of learning: Educational Data Mining (EDM) and Learning Analytics. This article discusses the relationship between these two communities and the key methods and approaches of educational data mining. It describes how these methods emerged in the early days of research in this area, which methods have seen particular interest in the EDM and learning analytics communities, and how this has changed as the field has matured and moved to making significant contributions to both educational research and practice.

Keywords

Association Rule; Association Rule Mining; Intelligent Tutoring System; Sequential Pattern Mining; Learn Analytics
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  1. Teachers College, Columbia University, New York, USA
  2. Center for Empathic Human-Computer Interactions, College of Computer Studies, De La Salle University, Manila, Philippines
  3. Architecture for Intelligence, The Institute of Scientific and Industrial Research, Osaka University, Osaka, Japan
