Predicting Student Performance from Multiple Data Sources

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9112)


The goal of this study is to (i) understand the characteristics of high-, average- and low-performing students in a first-year computer programming course, and (ii) investigate whether their performance can be predicted accurately and early enough in the semester for timely intervention. We triangulate data from three sources: submission steps and outcomes in an automatic marking system that provides instant feedback, assessment marks during the semester, and student engagement with the discussion forum Piazza. We define and extract attributes characterizing student activity and performance, and discuss the distinct characteristics of the three groups. Using these attributes, we built a compact decision tree classifier that predicts the exam mark with an accuracy of 72.69% at the end of the semester and 66.52% in the middle of the semester. We discuss the most important predictors and how such analysis can be used to improve teaching and learning.
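To make the pipeline described in the abstract concrete, the following is a minimal sketch of joining per-student attributes from three sources and fitting a small decision tree. It is an illustration under stated assumptions, not the authors' implementation: the attribute names (`submissions`, `midterm`, `posts_viewed`), the toy records, the group labels and the CART-style Gini split are all hypothetical, and the abstract does not say which tree learner was used.

```python
from collections import Counter

# Hypothetical toy data: attribute names and values are invented for
# illustration and are NOT taken from the paper.
marking = {"s1": {"submissions": 12}, "s2": {"submissions": 10},
           "s3": {"submissions": 8}, "s4": {"submissions": 7},
           "s5": {"submissions": 3}, "s6": {"submissions": 4}}
assessments = {"s1": {"midterm": 85}, "s2": {"midterm": 78},
               "s3": {"midterm": 62}, "s4": {"midterm": 58},
               "s5": {"midterm": 35}, "s6": {"midterm": 41}}
forum = {"s1": {"posts_viewed": 40}, "s2": {"posts_viewed": 25},
         "s3": {"posts_viewed": 10}, "s4": {"posts_viewed": 12},
         "s5": {"posts_viewed": 2}, "s6": {"posts_viewed": 0}}
exam_group = {"s1": "high", "s2": "high", "s3": "average",
              "s4": "average", "s5": "low", "s6": "low"}

def merge_sources(*sources):
    """Join per-student attribute dicts on student id (inner join)."""
    ids = set.intersection(*(set(s) for s in sources))
    merged = {}
    for sid in sorted(ids):
        row = {}
        for source in sources:
            row.update(source[sid])
        merged[sid] = row
    return merged

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (attribute, threshold) with the lowest weighted Gini, or None."""
    best = None
    for attr in rows[0]:
        values = sorted({r[attr] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[attr] <= t]
            right = [y for r, y in zip(rows, labels) if r[attr] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, attr, t)
    return None if best is None else (best[1], best[2])

def build_tree(rows, labels, depth=0, max_depth=2):
    """Grow a shallow tree; leaves are the majority class label."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth == max_depth or len(set(labels)) == 1:
        return majority
    split = best_split(rows, labels)
    if split is None:
        return majority
    attr, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[attr] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[attr] > t]
    return (attr, t,
            build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
            build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth))

def predict(tree, row):
    """Walk internal nodes (tuples) until a leaf label is reached."""
    while isinstance(tree, tuple):
        attr, t, left, right = tree
        tree = left if row[attr] <= t else right
    return tree

def accuracy(tree, rows, labels):
    """Fraction of rows whose predicted group matches the true group."""
    hits = sum(predict(tree, r) == y for r, y in zip(rows, labels))
    return hits / len(labels)

merged = merge_sources(marking, assessments, forum)
rows = [merged[sid] for sid in merged]
labels = [exam_group[sid] for sid in merged]
tree = build_tree(rows, labels)
```

Calling `accuracy(tree, rows, labels)` here measures fit on the same six toy records used for training, so it is not comparable to the accuracies reported above. Restricting `rows` to attributes available by mid-semester would mimic the early-prediction setting, again only as an assumption about how such a comparison could be set up.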


Computer science education; Student performance prediction





Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Irena Koprinska (1)
  • Joshua Stretton (1)
  • Kalina Yacef (1)
  1. School of Information Technologies, University of Sydney, Sydney, Australia
