Predicting Student Performance from Multiple Data Sources
The goal of this study is to (i) understand the characteristics of high-, average-, and low-performing students in a first-year computer programming course, and (ii) investigate whether their performance can be predicted accurately and early enough in the semester for timely intervention. We triangulate data from three sources: submission steps and outcomes in an automatic marking system that provides instant feedback, assessment marks during the semester, and student engagement with the discussion forum Piazza. We define and extract attributes characterizing student activity and performance, and discuss the distinct characteristics of the three groups. Using these attributes, we build a compact decision tree classifier that predicts the exam mark with an accuracy of 72.69% at the end of the semester and 66.52% in the middle of the semester. We discuss the most important predictors and how such analysis can be used to improve teaching and learning.
Keywords: Computer science education, Student performance prediction
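To make the classification idea in the abstract concrete, the following is a minimal sketch of a small decision tree that assigns a student to a performance group from a few activity attributes. It is not the classifier from the study: the attribute names (assessment mark, number of submissions, forum posts), the toy data, the Gini-based splitting rule, and the depth limit are all assumptions chosen for illustration.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature_index, threshold) that lowers weighted Gini, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score - 1e-12:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, max_depth=2):
    """Build a depth-limited tree; leaves are class labels, nodes are tuples."""
    split = best_split(rows, labels) if max_depth > 0 else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (
        f,
        t,
        build_tree([r for r, _ in left], [y for _, y in left], max_depth - 1),
        build_tree([r for r, _ in right], [y for _, y in right], max_depth - 1),
    )

def predict(tree, row):
    """Walk from the root to a leaf and return its class label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

# Hypothetical attributes per student (not taken from the study):
# [assessment mark in %, number of submissions, number of forum posts]
rows = [
    [90, 12, 5],
    [85, 10, 3],
    [65, 8, 1],
    [60, 15, 0],
    [35, 4, 0],
    [30, 20, 1],
]
labels = ["high", "high", "average", "average", "low", "low"]

tree = build_tree(rows, labels, max_depth=2)
group = predict(tree, [88, 9, 4])
```

Limiting the depth keeps the tree compact and easy to read, which matters when the splits themselves are meant to be interpreted as predictors. Accuracy on real data would need to be estimated with held-out students or cross-validation, which this sketch does not do.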