Confrustion and Gaming While Learning with Erroneous Examples in a Decimals Game
Prior studies have explored the potential of erroneous examples in helping students learn more effectively by correcting errors in solutions to decimal problems. One recent study found that while students experience more confusion and frustration (confrustion) when working with erroneous examples, they demonstrate better retention of decimal concepts. In this study, we investigated whether this finding could be replicated in a digital learning game. In the erroneous examples (ErrEx) version of the game, students saw a character play the games and make mistakes, and then they corrected the character’s errors. In the problem solving (PS) version, students played the games by themselves. We found that confrustion was significantly, negatively correlated with performance on both the pretest (r = −.62, p < .001) and the posttest (r = −.68, p < .001), and so was gaming the system (pretest r = −.58, p < .001; posttest r = −.66, p < .001). Post hoc (Tukey) tests indicated that students who did not see any erroneous examples (PS-only) experienced significantly lower levels of confrustion (p < .001) and gaming (p < .001). While we did not find significant differences in posttest performance across conditions, our findings show that students working with erroneous examples experience consistently higher levels of confrustion in both game and non-game contexts.
Keywords: Digital learning game, Erroneous examples, Affect, Affect detection, Confusion, Frustration, Gaming the system, Learning outcomes
Researchers have investigated the value of non-traditional approaches to problem solving. Worked examples [1, 2, 3] and erroneous examples [4, 5, 6] have been of particular interest. Worked examples demonstrate a procedure for arriving at a correct solution and may prompt students to explain the correct steps of a solution, while erroneous examples require them to identify and fix errors in incorrect solutions. The learning benefits of these approaches have been attributed to their role in freeing up cognitive resources that can then be used to learn new knowledge. Factors not specific to a particular approach may also interact with learning. Of these, affect and behavior have garnered the most attention [8, 9, 10, 11]. In particular, states of confusion, concentration and boredom have been shown to persist across computer-based learning environments (dialog tutors, problem-solving games, problem-solving intelligent tutors).
In a recent study, we found that students who were assigned erroneous examples implemented in an intelligent tutor experienced higher levels of confrustion, a mix of confusion and frustration, than those who were asked to answer typical problem-solving questions. However, we also found that confrustion was negatively correlated with both immediate and delayed learning, albeit less so for students who worked with erroneous examples.
This study, which replicates our recent findings in a game context rather than an intelligent tutoring system (ITS) context, was motivated by two observations. First, to determine whether this relationship is robust, it is important to explore whether our recent findings persist in other digital learning environments. This is because levels of affective states such as frustration and behaviors such as gaming the system have been shown to vary across learning environments and user interfaces [12, 15].
Second, research has shown that students who engage in gaming the system also experience frustration, though frustration does not always precede gaming. Therefore, it is interesting to explore whether this association persists when erroneous examples are implemented in a digital learning game context.
H1: Confrustion and gaming will be negatively related to performance, even when controlling for prior knowledge.
H2: Students in any of the conditions that include erroneous examples will experience higher levels of confrustion and gaming the system.
H3: Students in any of the conditions that include erroneous examples will perform better than their PS-only counterparts in the posttest.
The data used in this study was collected in the spring of 2015. Participants were recruited from four teachers’ classes at two middle schools, and participated over four to five class sessions. Both schools are located in the metropolitan area of a city in the United States. The analysis for this study included the data of 191 students, divided into four conditions within the game context.
Materials consisted of the digital learning game, Decimal Point, and three isomorphic versions of a test administered as a pretest and posttest. The Decimal Point game is laid out on an amusement park map with 24 mini-games, each of which students play for two rounds. All tests and the game used the Cognitive Tutor Authoring Tool (CTAT) as a tutoring backend. The game was designed with a focus on common misconceptions middle school students have about decimals.
We used gameplay data to generate machine learning models to detect confrustion and gaming the system. In this study, we applied text replay coding [19, 20] to student logs to label 1,560 clips (inter-rater reliability κ = .74). To predict confrustion and gaming, the detectors used 23 features of the students’ interaction with the decimal tutor, involving the number of attempts, the amount of time spent, and restart behavior.
After evaluating the performance of several classification algorithms in terms of Area Under the Receiver Operating Characteristic Curve (AUC ROC) and Cohen’s Kappa (κ), we built the confrustion detector using the Extreme Gradient Boosting (XGBoost) ensemble tree-based classifier (AUC ROC = .97, κ = .81) and the gaming detector using the J-Rip classifier (AUC ROC = .85, κ = .62).
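As a rough illustration of the two evaluation metrics named above, the following sketch computes Cohen’s κ and AUC ROC from scratch in plain Python. It is not the evaluation code used for the detectors: the function names are hypothetical, and the labels and scores are made up for eight imaginary clips.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Agreement between two label sequences, corrected for chance agreement."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: product of the two marginal proportions, summed over labels.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # degenerate case: both sequences use a single identical label
    return (observed - expected) / (1 - expected)


def auc_roc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Made-up data: 1 = clip coded as confrusted, 0 = not confrusted.
human = [1, 1, 0, 0, 1, 0, 0, 1]          # hypothetical human codes
predicted = [1, 1, 0, 0, 0, 0, 1, 1]      # hypothetical detector labels
scores = [0.9, 0.8, 0.2, 0.1, 0.4, 0.3, 0.6, 0.7]  # hypothetical detector confidences

kappa = cohens_kappa(human, predicted)
auc = auc_roc(human, scores)
print(f"kappa={kappa:.2f}, AUC={auc:.4f}")
```

κ is computed on hard labels while AUC is computed on the detector’s confidence scores, which is why the two metrics can diverge for the same model.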
Confrustion was significantly, negatively correlated with performance on the pretest (r = −.62, p < .001) and posttest (r = −.68, p < .001). A multiple regression model using confrustion to predict posttest performance while controlling for pretest was also significant, F(2, 188) = 181.14, p < .001. Within the model, both pretest (β = .57, p < .001) and confrustion (β = −.32, p < .001) were significant; confrustion was a significant, negative predictor of posttest performance even after controlling for pretest.
Gaming was significantly, negatively correlated with performance on the pretest (r = −.58, p < .001) and posttest (r = −.66, p < .001). A multiple regression model using gaming to predict posttest performance while controlling for pretest was also significant, F(2, 188) = 181.14, p < .001. Within the model, both pretest (β = .59, p < .001) and gaming (β = −.31, p < .001) were significant, indicating that gaming was also a significant, negative predictor of posttest performance even after controlling for pretest.
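To make “controlling for pretest” concrete, the sketch below computes standardized coefficients (β) for a two-predictor regression directly from pairwise Pearson correlations, along with the overall F statistic and its degrees of freedom. This is a minimal illustration under stated assumptions, not the analysis script used in the study: the function names are hypothetical and the six student records are invented.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def standardized_betas(y, x1, x2):
    """Standardized coefficients and R^2 for y regressed on x1 and x2."""
    r_y1, r_y2, r_12 = pearson_r(y, x1), pearson_r(y, x2), pearson_r(x1, x2)
    denom = 1 - r_12 ** 2  # zero only if the predictors are perfectly collinear
    beta1 = (r_y1 - r_y2 * r_12) / denom
    beta2 = (r_y2 - r_y1 * r_12) / denom
    r_squared = beta1 * r_y1 + beta2 * r_y2
    return beta1, beta2, r_squared


def overall_f(r_squared, n, k=2):
    """Overall F statistic for a model with k predictors and n observations."""
    df_model, df_resid = k, n - k - 1
    f = (r_squared / df_model) / ((1 - r_squared) / df_resid)
    return f, df_model, df_resid


# Invented scores for six hypothetical students (not data from the study).
pretest = [10, 14, 18, 22, 25, 30]
confrustion = [0.70, 0.40, 0.60, 0.30, 0.45, 0.20]
posttest = [12, 15, 21, 24, 29, 33]

beta_pre, beta_conf, r2 = standardized_betas(posttest, pretest, confrustion)
f_stat, df1, df2 = overall_f(r2, n=len(posttest))
print(f"beta_pretest={beta_pre:.2f}, beta_confrustion={beta_conf:.2f}, "
      f"R^2={r2:.3f}, F({df1}, {df2})={f_stat:.2f}")
```

With two predictors and 191 students, `overall_f` yields degrees of freedom (2, 188), matching the form of the F tests reported above.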
Table 1. Gaming, confrustion, and test performance by condition. Columns: Posttest M (SD), Gaming M (SD), Confrustion M (SD).
Finally, a repeated-measures analysis of variance (ANOVA) indicated that students across all conditions improved significantly from pretest to posttest, F(3, 187) = 167.04, p < .001. See Table 1 for means and standard deviations across conditions. A series of ANOVAs indicated no significant differences across conditions on pretest, F(3, 187) = 1.63, p = .18, or posttest, F(3, 187) = 1.65, p = .18.
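For readers unfamiliar with how a between-condition F statistic and its degrees of freedom arise, the following sketch implements a plain one-way ANOVA. It is an illustration only, with a hypothetical function name and toy numbers; it does not reproduce the repeated-measures analysis or the post hoc Tukey tests.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual scores around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between, df_within = k - 1, n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Toy scores for three hypothetical conditions (not data from the study).
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

With four conditions and 191 students, the same formula gives degrees of freedom (3, 187), as in the condition comparisons reported above.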
In this study, we implemented erroneous examples in a digital learning game context and found that students who played the erroneous examples versions of the game experienced higher levels of confrustion. There was also a significant correlation between gaming the system and confrustion. Future research might further explore the relationship between frustration and gaming, as previous research using affect detectors has found that frustration did not tend to precede gaming the system.
A previous study using a web-based intelligent tutor showed that students working with erroneous examples performed better than their problem-solving counterparts. This study, however, did not replicate that finding.
While it is not possible to make a direct comparison between confrustion levels in the game and intelligent tutor versions of the ErrEx condition, it is worth noting that students who played the game experienced higher levels of confrustion (M = 0.46, SD = 0.26) than those who used the intelligent tutor (M = 0.34, SD = 0.16). Since confrustion has been shown to be significantly, negatively correlated with learning, these higher levels of confrustion may explain why we did not see better learning effects of erroneous examples in the game context.
Alternatively, integrating the game interface with a feature where students watch a game character play the game for them may have negatively impacted both the game experience and the intended benefit of erroneous examples.
In an upcoming study, we will explore mechanisms intended to reduce the negative impact of confrustion and gaming on learning with erroneous examples in a digital learning game.
- 2. McLaren, B.M., Lim, S., Koedinger, K.R.: When and how often should worked examples be given to students? New results and a summary of the current state of research. In: Proceedings of the 30th Annual Conference of the Cognitive Science Society, pp. 2176–2181. Cognitive Science Society, Austin (2008)
- 3. Renkl, A., Atkinson, R.K.: Learning from worked-out examples and problem solving. In: Plass, J.L., Moreno, R., Brünken, R. (eds.) Cognitive Load Theory. Cambridge University Press, Cambridge (2010)
- 4. Isotani, S., Adams, D., Mayer, R.E., Durkin, K., Rittle-Johnson, B., McLaren, B.M.: Can erroneous examples help middle-school students learn decimals? In: Kloos, C.D., Gillet, D., Crespo García, R.M., Wild, F., Wolpers, M. (eds.) EC-TEL 2011. LNCS, vol. 6964, pp. 181–195. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-23985-4_15
- 7. McLaren, B.M., et al.: To err is human, to explain and correct is divine: a study of interactive erroneous examples with middle school math students. In: Ravenscroft, A., Lindstaedt, S., Kloos, C.D., Hernández-Leo, D. (eds.) EC-TEL 2012. LNCS, vol. 7563, pp. 222–235. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-33263-0_18
- 9. Baker, R.S., Corbett, A.T., Koedinger, K.R., Wagner, A.Z.: Off-task behavior in the cognitive tutor classroom: when students “game the system”. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 383–390 (2004)
- 12. Baker, R.S., D’Mello, S.K., Rodrigo, M.M.T., Graesser, A.C.: Better to be frustrated than bored: the incidence, persistence, and impact of learners’ cognitive–affective states during interactions with three different computer-based learning environments. Int. J. Hum.-Comput. Stud. 68(4), 223–241 (2010)
- 14. Liu, Z., Pataranutaporn, V., Ocumpaugh, J., Baker, R.: Sequences of frustration and confusion, and learning. In: Educational Data Mining (2013)
- 15. Baker, R.S., et al.: Educational software features that encourage and discourage “gaming the system”. In: Proceedings of the 14th International Conference on Artificial Intelligence in Education, pp. 475–482 (2009)
- 18. Isotani, S., McLaren, B.M., Altman, M.: Towards intelligent tutoring with erroneous examples: a taxonomy of decimal misconceptions. In: Aleven, V., Kay, J., Mostow, J. (eds.) ITS 2010. LNCS, vol. 6095, pp. 346–348. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-13437-1_66
- 19. Lee, D.M.C., Rodrigo, M.M.T., Baker, R.S.J.d., Sugay, J.O., Coronel, A.: Exploring the relationship between novice programmer confusion and achievement. In: D’Mello, S., Graesser, A., Schuller, B., Martin, J.-C. (eds.) ACII 2011. LNCS, vol. 6974, pp. 175–184. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-24600-5_21
- 20. Baker, R.S., Corbett, A.T., Wagner, A.Z.: Human classification of low-fidelity replays of student actions. In: Proceedings of the Educational Data Mining Workshop at the 8th International Conference on Intelligent Tutoring Systems, vol. 2002, pp. 29–36 (2006)
- 21. Chen, T., Guestrin, C.: XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 785–794. ACM (2016)
- 22. Cohen, W.W.: Fast effective rule induction. In: Twelfth International Conference on Machine Learning, pp. 115–123 (1995)