Unsupervised Automatic Detection of Learners’ Programming Behavior

  • Anis Bey
  • Mar Pérez-Sanagustín
  • Julien Broisin
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11722)


Programming has become one of the most in-demand professional skills. This reality is driving practitioners to search for better approaches to learning how to code and to supporting the process of learning to program. Prior work has focused on discovering, identifying, and characterizing the programming learning patterns that best relate to success. Researchers have proposed qualitative and supervised analytic methods based on trace data from coding tasks. However, these methods are limited in their ability to automatically identify students in difficulty without human intervention. The main goal of this paper is to introduce a three-phase process and a case study in which unsupervised clustering techniques are used to automatically identify learners’ programming behavior. The case study takes place in a Shell programming course, in which we analyzed data from 100 students to extract learners’ behavioral trajectories that positively correlate with success. As a result, we identified (1) a list of features that improve the quality of the automatic identification of learners’ profiles, and (2) some student behavioral trajectories that correlate with performance on the final exam.


Learning programming; Educational data mining; Unsupervised analysis methods; Learners’ behavior; Learning analytics



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Anis Bey (1), email author
  • Mar Pérez-Sanagustín (1, 2)
  • Julien Broisin (1)

  1. Institut de Recherche en Informatique de Toulouse (IRIT), Université Paul Sabatier Toulouse III, Toulouse, France
  2. Pontificia Universidad Católica de Chile, Santiago, Chile
