Beyond binary correctness: Classification of students’ answers in learning systems


Abstract

Adaptive learning systems collect data on student performance and use them to personalize system behavior. Most current personalization techniques focus on the correctness of answers. Although correctness is the most straightforward source of information about student state, research suggests that additional data are also useful, e.g., response times, hint usage, or the specific values of incorrect answers. However, these sources of data are not easy to utilize and are often used in an ad hoc fashion. We propose to use answer classification as an interface between raw data about student performance and algorithms for adaptive behavior. Specifically, we propose a classification of student answers into six categories: three classes of correct answers and three classes of incorrect answers. The proposed classification is broadly applicable and makes the use of additional interaction data much more feasible. We support the proposal with an analysis of extensive data from adaptive learning systems.
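To make the proposed interface concrete, the sketch below maps a raw interaction record to one of six answer classes. It is a minimal illustration only: the class names, the response-time threshold, and the AnswerEvent fields are assumptions made for this sketch, not the exact criteria defined in the article.

from dataclasses import dataclass
from enum import Enum
from typing import Optional, Set

class AnswerClass(Enum):
    # Three classes of correct answers and three classes of incorrect
    # answers, as proposed in the article; these particular labels are
    # illustrative assumptions, not the authors' definitions.
    CORRECT_FAST = "correct, quick, without hints"
    CORRECT_SLOW = "correct, slow, without hints"
    CORRECT_WITH_HINTS = "correct, hints used"
    WRONG_COMMON = "incorrect, a common wrong answer"
    WRONG_OTHER = "incorrect, an uncommon wrong answer"
    WRONG_NO_ANSWER = "incorrect, no answer given"

@dataclass
class AnswerEvent:
    answer: Optional[str]   # None if the student gave up or timed out
    correct: bool
    response_time: float    # seconds
    hints_used: int

def classify(event: AnswerEvent,
             median_time: float,
             common_wrong_answers: Set[str]) -> AnswerClass:
    """Map one raw performance record to one of six answer classes."""
    if event.correct:
        if event.hints_used > 0:
            return AnswerClass.CORRECT_WITH_HINTS
        # Speed judged against the item's median response time (an
        # assumed threshold; any calibrated cutoff could be used).
        if event.response_time <= median_time:
            return AnswerClass.CORRECT_FAST
        return AnswerClass.CORRECT_SLOW
    if event.answer is None:
        return AnswerClass.WRONG_NO_ANSWER
    if event.answer in common_wrong_answers:
        return AnswerClass.WRONG_COMMON
    return AnswerClass.WRONG_OTHER

# Example: a frequent wrong answer, given quickly and without hints.
event = AnswerEvent(answer="15", correct=False, response_time=4.2, hints_used=0)
print(classify(event, median_time=6.0, common_wrong_answers={"15", "25"}))
# AnswerClass.WRONG_COMMON

Adaptive algorithms can then consume these classes instead of raw logs; a mastery criterion might, for instance, give less weight to CORRECT_WITH_HINTS than to CORRECT_FAST, while WRONG_COMMON can signal a specific misconception.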



Author information


Corresponding author

Correspondence to Radek Pelánek.


About this article


Cite this article

Pelánek, R., Effenberger, T. Beyond binary correctness: Classification of students’ answers in learning systems. User Model User-Adap Inter 30, 867–893 (2020). https://doi.org/10.1007/s11257-020-09265-5

