The Challenge of Gaze-Based Strategy Identification in Statistics Education

Imagine students learning statistics on a laptop, interpreting histograms to solve a task. Assume that the webcam is good enough to track their eye movements while they think about the task. In the near future, the eye movements of all students could be automatically recorded online [Footnote 1] to document students’ strategies. With available techniques, it is in principle possible to give automated feedback on students’ task-specific solution strategies. In this imaginary situation, feedback on students’ strategies can even be given before students answer. We see our article as a first step toward using gaze data as a source in, for example, an intelligent tutoring system (ITS) in statistics education.

Although techniques are available, several challenges remain regarding the use of gaze data as a source in statistics education before an ITS can be considered. The first is the availability of gaze data, as the use of eye tracking in statistics education is rare (e.g., Strohmaier et al., 2020). Recent reviews of eye-tracking studies in mathematics education found only four studies in statistics education (one out of 33 included studies, Lilienthal & Schindler, 2019; three out of 161, Strohmaier et al., 2020).

The second challenge is that the current usage of gaze data often addresses general pedagogical themes (e.g., metacognitive skills; Lai et al., 2013) instead of task-specific strategies teachers in statistics education are interested in (sometimes called didactics or domain-specific pedagogy). Most studies investigating students’ strategies look at general strategies including planning and evaluation (e.g., Eivazi & Bednarik, 2010) or global scanning followed by local viewing (Van der Gijp et al., 2017). Other studies look at cognitive models such as visual working memory (e.g., Epelboim & Suppes, 2001). The number of studies that uncover task-specific strategies in mathematics in primary and secondary education (e.g., Lilienthal & Schindler, 2019; Strohmaier et al., 2020) and science (e.g., Garcia Moreno-Esteva et al., 2020; Klein et al., 2021; Kragten et al., 2015) is still relatively small but growing. For example, patterns in students’ gazes indicating strategies have already been found in mathematical domains such as numbers (Schindler et al., 2021), arithmetic (Green et al., 2007), fractions (Obersteiner & Tumpek, 2016), proportional reasoning (Shayan et al., 2017), area and perimeter (Shvarts, 2017), Cartesian coordinates (Chumachenko et al., 2014), geometry (Schindler & Lilienthal, 2019), trigonometry (Alberto et al., 2019), and functions (e.g., parabola; Shvarts & Abrahamson, 2019). For mathematics and statistics teachers, such strategies are important as they can reveal students’ knowledge of and deficiencies in this specific topic (cf. Gal, 1995).

A third challenge is that the automation of strategy identification, using interpretable models or machine learning techniques in combination with gaze data, is even rarer in statistics education: only one of the four studies in the previously mentioned reviews used a machine learning approach (Garcia Moreno-Esteva et al., 2016), and none used an interpretable mathematical model. Our present study addresses this third challenge by investigating how these two data science tools—an interpretable mathematical model and machine learning algorithms—can be used to automatically identify students’ strategies on histograms based on gaze data.

Fourth, although the use of gaze data in ITSs is not new, the majority of studies on ITSs that use gaze data seem to focus on general skills such as engagement (e.g., D’Mello et al., 2012). This is in line with a review of research articles on artificial intelligence in education (AIED) in which an independent cluster of recent eye-tracking articles emerged that “include ‘collaborative learning’, ‘engagement’, ‘video-based learning’, and ‘recommender system’” (Feng & Law, 2021, p. 293).

Fifth, many ITSs in mathematics and statistics education seem to focus on procedural knowledge—problems that can be solved by following a stepwise solving procedure such as solving a linear equation—although ITSs that focus on students’ task-specific strategies do exist, also in statistics education (e.g., Tacoma et al., 2019). To the best of our knowledge, none of these seem to use gaze data as a source.

That said, ITSs that use gaze data for identifying visual-based task-specific strategies, as far as we are aware, do not exist yet in statistics education. Before such a gaze-based ITS can be considered and developed, we not only need to be able to link students’ mathematical task-solving strategies to specific gaze patterns but also to automate the identification (or classification, as data scientists would say) of such strategies. In our previous, qualitative study, we inferred students’ strategies from their gaze data. In the current study, we concentrate on automatization through the research question: How can gaze data be used to automatically identify students’ task-specific strategies on single histograms?

The potential of automated identification of such strategies is to make large-scale, personalized feedback possible for online learning both in the initial stages of learning and during expertise development (Ashraf et al., 2018; Brunyé et al., 2019; Jarodzka et al., 2017; Hwang & Tu, 2021). This can make feedback in online courses or during homework more efficient and more accurate.

This article aims to show how the identification of students’ task-specific strategies on histograms can be automated. We expect that this work can nurture the dialogue between experts in the field of data science algorithms—more specifically experts regarding interpretable mathematical models (IMM) and machine learning algorithms (MLAs)—and educational researchers. IMM and MLA experts may be more interested in how the IMM or MLA was or could be tailored to the specific application. Educational researchers may be more interested in using an MLA as it is, as a black box, and wonder what it provides them and how well it works. The advantage of an IMM for educational researchers is that it is transparent in how it exactly came to its decisions for individuals. We think this article can fuel the dialogue between IMM and MLA experts and educational researchers to keep the boundaries between disciplines permeable. At such boundaries, exciting new research can emerge.

In this article, we develop an interpretable mathematical model (IMM) and compare its results with those of a machine learning algorithm (MLA). We combine these two methods from data science with theories and insights from psychology and neuroscience research (e.g., on eye tracking, what gaze data can and cannot tell us, and the sensorimotor system) and from mathematics and statistics education research (e.g., on averages and histograms), and bring this to the world of human–computer interaction (in which, for example, the usability of an IMM or MLA is important). This means that we sometimes need to bridge worlds in terms of terminology, expectations, and explanations.

Our study is in line with the call for research focusing on methods for using measures of micro-level learning processes—including gaze data (Harteis et al., 2018). For the specific topic of histograms, our study also provides the level of detail that Peebles and Cheng (2001) referred to: “From […] eye-movement studies it is argued that there is a missing level of detail in current task analytic models of graph-based reasoning.” (p. 1069). Yuan et al. (2019) showed that there is a need for searching for “visual cues that mediate the patterns that we can see in data, across visualization types and tasks” (p. 1).

Theoretical Background of the Tasks and the Use of Gaze Data

Estimating the Arithmetic Mean from Histograms

Developing students’ statistical literacy, reasoning, and thinking is an important goal of education (Ben-Zvi et al., 2017). Statistical literacy is especially important in the world of “big data” and alternative truths (Burrill, 2020). Most adults will be data consumers, making decisions based on data collected by others (Gal, 2002). Statistical data in tables are not always clear. Messages can be clearer if these data are presented in more aggregated forms in graphical representations—including dotplots, boxplots, and histograms—that stress some aspects of the data (e.g., variability) and leave out other information (e.g., the exact measurements). Students, however, find it difficult to correctly interpret histograms.

A review of students misinterpreting histograms revealed that many of their difficulties stem from not understanding the statistical key concept of data (Boels et al., 2019a, b, c). The key concept of data includes an understanding of what, how many, and how variables and their values are depicted in a histogram. Despite many carefully designed interventions to tackle misinterpretations (e.g., Kaplan et al., 2014), students’ difficulties with histograms remain (e.g., Cooper, 2018). We, therefore, decided to use eye tracking to study in depth how students interpret histograms (Boels et al., 2018, 2019a, 2022a, b).

Strengths and caveats in students’ knowledge can be revealed by asking them to estimate averages from data in different representations (e.g., histogram, dotplot, case-value plot; cf. Gal, 1995). Estimating the mean can be seen as a prerequisite for assessing variability, as the variation in data is compared to a measure of center (e.g., standard deviation from the mean). Furthermore, our students are familiar with the mean, but not so much with variability. Therefore, in a previous eye-tracking study, students were asked to estimate the mean from various—but univariate—statistical graphs in 25 items (e.g., Boels et al., 2022a, b). In the present article, we re-use gaze data from a subset of this previous study containing all five single histogram items.

Historical examples show that the mean has emerged from estimating representative values for a dataset through compensation and balance (Bakker & Gravemeijer, 2006). Students exhibit minimal difficulty in estimating the mean from case-value plots (Cai et al., 1999), unless zero is one of the measured values (Boels et al., 2022a, b). Most students know how to calculate the arithmetic mean from raw data (e.g., Konold & Pollatsek, 2004). In a study with various items—including finding the “average” allowance from a histogram—five approaches were found for solving the items: average as (1) mode, (2) algorithm, (3) reasonable, (4) midpoint, and (5) balancing point (Mokros & Russell, 1995). Students often (implicitly) use the mean of frequencies in a histogram (cf. Cooper, 2018). The latter is incorrect when applied to histograms but correct for finding the mean from a case-value plot, and can be seen as finding the horizontal line that makes all bars of equal height by using compensation.

The weighted estimation of the mean in a histogram is the balance or gravity point of the graph (e.g., Mokros & Russell, 1995). This mean can be found by taking the range or spread of the data in the histogram into account together with the height of the bars. For this approach, it is not necessary to read off frequencies on the vertical axis. We call this approach a histogram (interpretation) strategy or correct strategy. An estimate of the mean in a histogram with equal bin widths can also be computed by multiplying the frequency or percentage (the height of each bar) by the middle value of that bar, summing the results over all bars, and dividing by the sum of the frequencies. No students in the previous study used this approach. Instead, all students who used a computational approach added all frequencies and divided this sum by the number of bars. This would be a correct strategy if the height of each bar represented a weight and the number of bars were the number of measured weights (as in a case-value plot). Therefore, this count-and-compute strategy is incorrect for histograms.
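
To make the contrast between the two computations concrete, both can be sketched in a few lines (a minimal illustration with made-up bin midpoints and frequencies, not data from the study):

```python
# Hypothetical histogram: bin midpoints (weights) and bar heights (frequencies).
midpoints = [1, 3, 5, 7, 9]
freqs = [2, 5, 8, 4, 1]

# Correct weighted mean: each bin midpoint weighted by its bar's frequency.
weighted_mean = sum(m * f for m, f in zip(midpoints, freqs)) / sum(freqs)  # 4.7

# Incorrect count-and-compute strategy: the mean of the bar heights,
# which yields a mean frequency rather than a mean weight.
count_and_compute = sum(freqs) / len(freqs)  # 4.0
```

The two approaches give different answers on the same graph, which is part of what makes the strategies distinguishable from students’ numerical estimates as well as from their gazes.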

In our previous study, we found several strategies for estimating the mean from a histogram based on students’ visual search strategies (cf. Goldberg & Helfman, 2011) inferred from their gaze patterns (see the Empirical Background of the Re-Used Data section). A visual search strategy can be part of a task-specific strategy. People use these strategies to get “from an initial problem state to a desired goal state, without knowing exactly what actions are required to get there (Newell & Simon, 1972)” (Van Gog et al., 2005, p. 237). As the debate between Lawson (1990) and Sweller (1990) illustrates, there are different opinions on what strategies are. In our study on graph interpretation, students’ strategies typically consist of (1) visually searching for the relevant information, (2) making inferences based on this information, and in some cases (3) verifying the inference; see also the section Theoretical Interpretation of Students’ Gaze Patterns.

Given our focus on what eye tracking and data science tools (IMM, MLA) can provide to educational researchers, we now first discuss the theoretical background of using eye tracking. In the Research Approach section, we elaborate on the data science tools used (IMM and MLA).

Use of Gaze Data

There are multiple reasons for using gaze data to identify students’ strategies. First, eye-movement patterns (e.g., order of fixations [Footnote 2] or saccades) are online, real-time measures that may allow for more adequate feedback than feedback on answers only. Moreover, feedback on strategies can be provided earlier during the problem-solving process (e.g., Gerard et al., 2015; Mitev et al., 2018), although strategy feedback can also be on answers (e.g., Tacoma et al., 2019). Second, low-accuracy eye tracking—for example, through webcams—is expected to be a standard option for computers in several years (e.g., Kok & Knoop-Van Campen, 2022), which would make it possible to give feedback to large groups of students. Third, gaze data are direct motor data that are almost impossible to manipulate. This makes measuring eye movements more reliable than, for example, thinking-aloud protocols (e.g., Van Gog et al., 2005). In addition, younger students, novices, and sometimes even experts find it difficult to articulate their thinking process, are sometimes not aware of their thinking (e.g., Green et al., 2007) or might respond to what they think the interviewer expects or what is easily accessible (e.g., Wilson, 1994).

The implicit assumption here is that eye movements reflect cognitive processes. Spivey and Dale (2011) state: “Our most frequent motor movements—eye movements—are sure to play an important role in our cognitive processes. […they] provide the experimenter with a special window into these cognitive processes.” (p. 551). It is indeed generally assumed that gaze data can provide evidence of conceptual actions, albeit with some caveats (e.g., Radford, 2010). First, the relationship between eye movements and cognitive processes is not straightforward (e.g., Kok & Jarodzka, 2017; Russo, 2010). In addition, not every eye movement is part of a student’s strategy (e.g., Anderson et al., 2004; Schindler & Lilienthal, 2019). Furthermore, one could argue that students’ fixations on the screen do not indicate where they looked, as people also observe through their peripheral vision (Lai et al., 2013). Nevertheless, in our items, focused vision is needed for locating detailed information (e.g., locating a bar, or reading a specific number on the horizontal axis). As the fovea has the greatest acuity (sharpness; Wade & Tatler, 2011), locating a number on an axis is only possible with foveal vision, and peripheral vision most likely guides our attention to it and to the bars (cf. Kok & Jarodzka, 2017). Therefore, we can infer that the fixation on the screen is what the student is looking at.

Choosing which eye-movement measures to use is a methodological decision. According to a review study (Lai et al., 2013), the measures used most often were temporal (e.g., total fixation duration, time to first fixation, total reading time), followed by count measures (e.g., fixation count). The least used measures were spatial (e.g., scanpath, fixation position, order of areas of interest (AOIs)). Goldberg and Helfman (2010) stated that “with appropriate task design and targeted analysis metrics, eye-tracking techniques can illuminate visual scanning patterns hidden by more traditional time and accuracy results” (p. 71). Scanpaths can reveal learning in more detail (Hyönä, 2010). Tai et al. (2006), therefore, advise using spatial measures such as scanpaths in problem-solving research.
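
As an illustration of these three families of measures, each can be computed from a simple fixation log (the list format and AOI labels below are our own made-up example, not the output of any particular eye-tracking package):

```python
from collections import Counter

# Each fixation: (x, y, duration in ms, AOI label); made-up values.
fixations = [
    (100, 200, 180, "x_axis"),
    (105, 150, 220, "graph"),
    (110, 100, 200, "graph"),
    (300, 120, 150, "y_axis"),
]

# Temporal measure: total fixation duration.
total_duration = sum(dur for _, _, dur, _ in fixations)  # 750 ms

# Count measure: number of fixations per AOI.
counts = Counter(aoi for _, _, _, aoi in fixations)

# Spatial measure: the scanpath as the ordered sequence of visited AOIs.
scanpath = [aoi for _, _, _, aoi in fixations]
```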

Studies using students’ scanpaths for identifying strategies are rare, and often use the sequence of AOIs (e.g., Garcia Moreno-Esteva et al., 2018) or scanpaths that are aggregated over time and fixations (e.g., in heatmaps, Schindler et al., 2021). Up to now, scanpaths have mostly required qualitative inspection or analysis of the eye-movement data (e.g., Alemdag & Cagiltay, 2018; Susac et al., 2014)—especially when looking for task-specific strategies. Figure 1 provides an example of such a scanpath (a sequence of fixations and saccades). Qualitative analysis is both time-consuming and hard to objectify. In this article, we therefore use the raw scanpath data to identify students’ strategies.

Fig. 1

Stable triangular scanpath that was interpreted as an attentional anchor (AA). Note. Circles are fixations (places where students looked), arrows indicate the direction of saccades (fast transitions between two fixations), redrawn after Shayan et al. (2017, p. 175). Picture not included in the CC-license.

In studies using angles and direction of saccades in educational settings (e.g., Dewhurst et al., 2018), scanpaths are often compared on multiple or all AOIs [Footnote 3]. In our previous, qualitative study, we took a new approach in using the perceptual form (e.g., vertical gaze pattern) of the gazes on one AOI only (the graph area)—the one that was found particularly relevant for students’ task-specific strategies (see also the following section, Inferring an Attentional Anchor from Gaze Data). This perceptual form consists of angles and directions of saccades that are roughly aligned. So far, we have not found any other study in education that uses the alignment of saccades. For more details, see the Research Approach section (Students’ Strategies). A possible advantage of looking at saccades over fixations or the order of AOIs is that saccades may be less sensitive to spatial offsets (e.g., Jarodzka et al., 2010).
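
One way to quantify the alignment of saccades within a single AOI is to compute each saccade’s angle from consecutive fixation positions and check how concentrated those angles are. A minimal sketch under our own simplifying assumptions (the fixation coordinates and the threshold are illustrative, not values from the study):

```python
import math

# Consecutive fixation positions (x, y) within the graph-area AOI; made-up
# values forming a roughly vertical gaze pattern (screen y grows downward).
fixations = [(200, 100), (202, 260), (198, 105), (201, 255), (199, 110)]

# Saccade angles in degrees, folded to [0, 180) so that up and down
# movements count as the same direction.
angles = []
for (x1, y1), (x2, y2) in zip(fixations, fixations[1:]):
    angle = math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0
    angles.append(angle)

# Saccades count as "aligned" here if their folded angles stay in a narrow band.
spread = max(angles) - min(angles)
aligned = spread < 15.0  # illustrative threshold
```

For the vertical pattern above, all folded angles sit near 90 degrees, so the saccades are detected as aligned.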

Inferring an Attentional Anchor from Gaze Data

For our theoretical interpretation of the perceptual form of students’ gaze patterns on the graph area (horizontal or vertical line segments), we draw upon insights from theories on enactivism and embodied cognition. According to these theories, cognition arises from interaction with the environment (e.g., Rowlands, 2010). The focus of an actor’s interaction with this environment is called an attentional anchor (AA) (Hutto & Sánchez-García, 2015). An AA is “a real or imagined object, area, or other […] behavior of the perceptual manifold that emerges to facilitate motor-action coordination” (Abrahamson & Sánchez-García, 2016, p. 203). Other behavior of the perceptual manifold, for example, includes students gesturing a horizontal line when explaining how they made all bars equally high in a case-value plot strategy. The AAs found in our previous research (Boels et al., 2022a) facilitated students’ imagined actions (strategies for finding the mean)—regardless of the strategies’ correctness.

Examples of AAs in motor action can be found in research on high school trigonometry where students coordinate the movement of the left hand to describe a circle and their right hand to describe a sine graph (Alberto et al., 2019). Another example is the manipulation of two bars that are proportional to each other, see Fig. 1 (e.g., Shayan et al., 2017). Students needed to keep the bars green, which occurred when the bars had a fixed ratio of, for example, 1:2 (unknown to the students). They dragged both bars up to find various points where both are green. Students had different strategies for finding these points. In one strategy, gaze fixation is on the right-hand bar in the middle, which is mathematically relevant, as this bar is twice as high as the left-hand bar (Fig. 1). As this imagined triangle emerges to facilitate the coordination of the motor action, it is an example of an AA.

Theoretical Interpretation of Students’ Gaze Patterns

Although enactivism often assumes manipulation of an environment, there are indications that people’s sensorimotor systems are also activated in situations without physical manipulation (e.g., Fabbri et al., 2016; Lakoff & Núñez, 2000; Molenberghs et al., 2012). In the retrospective stimulated recall interviews, students talked about the graphs as if manipulation were possible. For example, student L10 refers to chopping and flattening all bars (hence, using compensation, Bakker & Gravemeijer, 2006), see the excerpts below. This student describes sensorimotor actions, namely: breaking up the longest bar into pieces that are then divided over the shorter bars, resulting in a horizontal line along the top of the now equally high bars. This would be a correct strategy for finding the arithmetical mean in a different type of graph (namely, a case-value plot, e.g., Cai et al., 1999; Yuan et al., 2019). An imaginary horizontal line segment is used for coordinating this imagined action. Gaze data show this imaginary segment in the form of a stable scanpath indicating the focus of interaction of this student, see Fig. 2. We, therefore, interpret this segment as an AA. Gazes on Item06 indicate that this AA was also visible before Item20.

L10: I looked at the graph itself [Item06] first and then I kind of looked at the axes, how is it constructed, and then I looked at the question and then I looked again at the frequency, how to group it. That was it in my opinion.

R1: And were you doing that here the same way you did with those other [previous] questions? Chop it into pieces?

L10: Yes

[…]

R1: You said five here [Item20].

L10: Yes, because I thought the weight would be on the left side. So, if you flattened it all out, between 4 and 6 would be the imaginary [horizontal] line.

Fig. 2

Part of the stable scanpath of student L10 on Item06 (left) and Item20 (right). Note. The stable scanpaths reveal the horizontal line segment along which the student looks on Item06 (left; here superimposed on the figure for the reader) and Item20 (right). Circles indicate fixations and thin lines indicate saccades. As the weight value is on the horizontal axis, so is the actual mean. However, from the eye movements of this student, we can conclude that the mean of the frequency is estimated, instead of the mean weight. The interview data support this conclusion. This stable scanpath, therefore, indicates an incorrect (case-value plot interpretation) strategy

Research Approach

In a previous study, we collected and qualitatively analyzed students’ gaze and stimulated recall data and classified these into groups (Table 1) of students using the same strategy (Boels et al., 2022a). The present study consists of three phases: (1) analysis of gaze data on one AOI (that contains the stable scanpath) through a random forest MLA; (2) construction of a separate interpretable mathematical model (IMM) based on the gaze data and using insights from qualitative research; and (3) comparison of the results of the MLA with the IMM and with the results of the previous qualitative study, as the MLA serves as a baseline against which to compare our IMM. The most important information from the previous study is presented in the following section. Next, the first two phases of the present study are explained in more detail. As we aim to bridge different research fields, we also provide background information on the data science tools used and the considerations that guided our choices. A comparison of the results is made in the section Results of Applying an MLA and IMM.
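
Feeding raw gaze data to a classifier such as a random forest typically requires turning each trial’s variable-length sample stream into a fixed-length feature vector. The sketch below shows one possible encoding, a normalized histogram of movement directions; this is our own illustration of the general idea, not the exact pipeline of the study:

```python
import math

def direction_histogram(samples, n_bins=8):
    """Encode a trial's raw gaze samples (x, y) as a fixed-length
    histogram of movement directions, normalized to sum to 1."""
    counts = [0] * n_bins
    total = 0
    for (x1, y1), (x2, y2) in zip(samples, samples[1:]):
        if x1 == x2 and y1 == y2:
            continue  # skip zero-length displacements
        angle = math.atan2(y2 - y1, x2 - x1) % (2 * math.pi)
        counts[int(angle / (2 * math.pi) * n_bins) % n_bins] += 1
        total += 1
    return [c / total for c in counts] if total else counts

# A mostly vertical trial puts most of its mass in the up/down bins.
trial = [(100, 10), (100, 200), (101, 20), (99, 190)]
features = direction_histogram(trial)
```

Vectors like `features` can then be stacked into a matrix and passed to any off-the-shelf supervised learner, with the strategy codes from the qualitative analysis as training labels.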

Table 1 Strategies, percentage of trials (correct strategy in bold), N = 50 per item (see also Boels et al., 2022a)

Empirical Background of the Re-Used Data

Participants

This study re-uses data from an eye-tracking study with 50 Grades 10–12 students of a Dutch public [Footnote 4] city high school (Boels et al., 2022a, b). All participants are Dutch pre-university track students (15–19 years old; mean age 16.3; all with normal or corrected-to-normal vision; 23 males, 27 females).

Eye-Tracking Apparatus

The gaze data that are used as input for the IMM and the MLA were collected with a Tobii X2-60 eye tracker with a sampling rate of 60 Hz that was placed on an HP ProBook laptop between the laptop’s 13-inch screen and keyboard (Fig. 3). A chin rest was used to reduce data loss and improve the accuracy of the gaze data. Furthermore, a 9-point calibration on the screen was used. The Tobii Pro Studio 3.4.5 software recorded in real time where people looked at the screen by using harmless infrared light to detect their gaze. Data loss was minimal (7.2% on average) and none of the students (averaged over all trials [Footnote 5]) or items (averaged over all students) went over the exclusion point of 34%. The mean accuracy is 56.6 pixels (1.16°), with the highest accuracy on the graph area (mean 13.4 pixels or 0.27°). The average precision (0.58°; RMS-S2S; Holmqvist et al., 2023) is considered good. For other measures of gaze data quality, see Boels et al. (2022a, b). We, therefore, did not exclude any student, although, for some specific trials, data loss could come close to or even go over this exclusion point (e.g., students L39 and L32 had 27.5% and 46.0% data loss, respectively, on their trials of Item01). Some data loss is normal, due to blinking, wearing glasses or make-up, epicanthic eyes, or students looking above or below the screen while thinking. Another reason for not excluding students is that this would not be representative of a future gaze-based feedback application where real-time data collection and processing would occur.
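
Accuracy in pixels and in visual degrees are related through the standard visual-angle formula, given the screen’s physical pixel size and the eye-to-screen distance. A sketch of that conversion (the pixel size and viewing distance below are assumed, illustrative values; the study does not report them):

```python
import math

def px_to_degrees(offset_px, px_size_cm, distance_cm):
    """Convert an on-screen offset in pixels to visual angle in degrees,
    given the physical pixel size and the eye-to-screen distance."""
    offset_cm = offset_px * px_size_cm
    return math.degrees(2 * math.atan(offset_cm / (2 * distance_cm)))

# Illustrative values only: with a ~0.0227 cm pixel pitch and ~63.5 cm
# viewing distance, 56.6 px corresponds to roughly 1.16 degrees,
# consistent with the ratio reported above.
angle = px_to_degrees(56.6, 0.0227, 63.5)
```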

Fig. 3

Set-up of the experiment. Note. The red arrows in the right-hand picture point at the eye tracker (bottom left, see the red oval) and the chin rest apparatus (top middle). The person in the picture is not a participant. Pictures are not included in the CC-license.

Tasks

The data on five histogram items are used and analyzed in the present article, see Fig. 4. The question for all five items was: “What is approximately the mean weight of the packages [name of postal worker] delivers?”

Fig. 4

Graphs (all single histograms) used in Item01, 02, 06, 19 and 20. Note. All items are translated into English and item numbers are added for the reader’s convenience

The students verbally estimated the mean (Table 2); their answers were coded as correct or incorrect. Answer correctness can differ from strategy correctness: for example, a student may underestimate the mean even when using a correct strategy to locate it.

Table 2 Answers given by the students, N = 50 (see also Boels et al., 2022a)

Students’ Strategies

Qualitative data were collected through expert judgment on students’ strategies on the items (Table 1), which in turn was based on (1) videos of students’ gaze data on the items; (2) interview data when available; (3) students’ answers. Three common strategies were identified (Table 1): a histogram strategy (Fig. 5—a correct strategy that reads off the estimation on the horizontal weight axis), a case-value plot strategy (Fig. 2—a strategy that would be correct for a case-value plot but is incorrect for finding the mean from a histogram as it returns the mean frequency, read on the vertical frequency axis), and a count-and-compute strategy (an incorrect strategy [Footnote 6] that, for example, adds the height of the bars, hence the frequencies, and divides by the number of bars—resulting in a kind of zig-zag pattern of horizontal and vertical gazes, see Boels et al., 2022a for more details). Both the case-value plot strategy and count-and-compute strategy relate to the same misinterpretation: interpreting the histogram as a case-value plot (Boels et al., 2019a; Cooper, 2018); the difference is whether students estimated (case-value plot strategy) or calculated (count-and-compute strategy) the mean. Hence, almost all strategies can be attributed to one of two classes: one in which students correctly interpreted the graph as a histogram and one in which students incorrectly interpreted the graph as a case-value plot.

Fig. 5

Part of a stable vertical scanpath (left) and all gazes (right) on Item06. Note. The left figure reveals the vertical line segment (left: superimposed for the reader) in student L26’s gazes on Item06 (right: all gazes). Circles indicate fixations, and thin lines indicate saccades: fast transitions between two fixations. The left figure is translated for the reader’s convenience. These gazes indicate a correct (histogram interpretation) strategy

A second coder coded 10% of the trials (25 trials). The interrater reliability of the coding in Table 1, measured with Cohen’s kappa, is 0.62, which is considered substantial (Landis & Koch, 1977). Four out of five disagreements involved the second coder choosing a count-and-compute strategy and the first coder choosing one of the other strategies. If this coding is aggregated to correct (histogram strategy) versus incorrect (all others)—as used as input for training the machine learning algorithm—agreement rises to 22 out of 25 trials, which corresponds to a Cohen’s kappa of 0.73 (substantial).
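
Cohen’s kappa corrects raw agreement for the agreement expected by chance from the coders’ marginal distributions. A minimal implementation for two coders (the label sequences below are illustrative, not the study’s actual codes, so the resulting kappa differs from the values reported above):

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two equal-length sequences of categorical codes."""
    n = len(coder_a)
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    count_a, count_b = Counter(coder_a), Counter(coder_b)
    labels = set(coder_a) | set(coder_b)
    expected = sum(count_a[l] * count_b[l] for l in labels) / (n * n)
    return (observed - expected) / (1 - expected)

# Illustrative: 25 trials, binary correct/incorrect coding, 3 disagreements.
a = ["c"] * 12 + ["i"] * 13
b = ["c"] * 15 + ["i"] * 10
kappa = cohens_kappa(a, b)  # observed agreement 22/25
```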

We used a code for a correct (bold) or an incorrect strategy (all others, mostly misinterpreting the histogram as a case-value plot) as input in the training phase of the MLA and compared these to the results of the IMM and of the testing phase of the MLA (we explain later how the testing was done relative to the training). The eye movements belonging to correct strategies were mainly vertical, and answers were read on the horizontal axis, as the data in a histogram are positioned along this axis (Fig. 5).

In contrast to most eye-tracking studies, in the qualitative study we looked at the perceptual form of the scanpath, for example, the vertical gaze pattern (Fig. 5). We refer to such a pattern as a stable scanpath if it includes multiple aligned fixations and saccades along this scanpath and was explicitly mentioned by at least some students as being relevant for their strategy (e.g., Boels et al., 2018, 2019a, 2022a, b). This vertical line is formed by looking back and forth between the balance point of the graph on the horizontal axis and the height of the bars as a weighting factor. Incorrect strategies contained mainly horizontal gaze patterns and searching for the answer on the vertical (frequency) axis—hence using incorrect data—and leveling all bars (Fig. 1 in the section on the Theoretical Interpretation of Students’ Gazes).

Interpretable Mathematical Model and Machine Learning Algorithm

Considerations for Analyzing Eye Movements with a Machine Learning Algorithm

The use of an MLA for analyzing gaze data is still uncommon (e.g., Kang et al., 2020) and even more unusual in educational settings (e.g., Brunyé et al., 2019; Mitev et al., 2018). For example, in a review of eye tracking in medical education, the use of an MLA is described in only two out of 33 studies (Ashraf et al., 2018). In many studies, areas of interest (AOIs) are used. AOIs are predefined areas of the item that are judged by the researchers as being distinct from each other and relevant to the strategy. Eye movements on these AOIs or between them are recorded. Typically, the information is reduced to a single measure for each AOI—for example, whether an AOI was visited or not. In our study, we used raw gaze data on one large AOI to study the perceptual form of the scanpath. Examples of AOI usage in ML studies are the order in which AOIs are visited, or the number of fixations on the areas of interest (e.g., Garcia Moreno-Esteva et al., 2020; Najar et al., 2014). In other studies, temporal measures are used, such as the total duration of fixations on an AOI or the mean duration per fixation (e.g., Voisin et al., 2013). Schindler et al. (2021) used heatmaps of students’ gazes in an MLA, thus discarding the order of fixations in the scanpath pattern. However, in most eye-tracking studies, no MLA is used at all (e.g., Strohmaier et al., 2020; Van Gog & Jarodzka, 2013). In some studies, an interpretable (vector) model was made from the raw data (e.g., Dewhurst et al., 2012). The use of multimodal data—including eye-movement data—in combination with an MLA is an emerging line in educational research (Järvelä et al., 2019). What is new in our study, to the best of our knowledge, is that we feed the MLA (and IMM) with raw gaze data to identify students’ task-specific strategies.

An advantage of supervised MLAs for analyzing gaze data is that they provide a rather generic approach. However, the disadvantage of many MLAs is that they are effectively black boxes that do not reveal how they identify the results from the data and, hence, what analytical model emerges (e.g., Guidotti et al., 2018; Kuhn & Johnson, 2013; Lakkaraju et al., 2019; Rudin, 2019). Therefore, it is unknown what gaze data patterns are used by the MLA for data classification.

Several solutions are suggested in the literature to overcome this disadvantage. More transparency can be created by (1) model or global explanations, (2) outcome or local explanations, or (3) model inspection or differential explanations (e.g., Guidotti et al., 2018; Lakkaraju et al., 2019). Explainable means that humans can understand how the MLA reached its decision (e.g., Doshi-Velez & Kim, 2017). An alternative is to (4) create an interpretable model directly from the data, sometimes after first using MLAs to understand what is relevant in the data (e.g., Rudin, 2019), or (5) use white-box techniques such as models that are made a priori. The disadvantage of (5) is that it is based on human assumptions, not on data; this risks relevant information in the eye-movement data being overlooked (e.g., Villagrá-Arnedo et al., 2017).

In our case, (1) model explanation could involve trying to extract the general rules that the MLA uses to decide what strategy a student uses (for examples from weather forecasting, see McGovern et al., 2019). Outcome explanations (2) might involve trying to extract why student A is identified as having strategy z (for clinical examples, see Krause et al., 2016; for an example with birds, see Rudin, 2019). Model inspection (3) could be understood as finding out how sensitive the model is to variations in the data.

As (2) is even more complex to achieve, we decided to make an interpretable model (4) instead. An interpretable model is a model that captures the most important characteristics of the strategy, in a way that can be understood by human beings and that is transparent (Rudin, 2019). An interpretable model is often a mathematical and logical model (e.g., Hancox-Li, 2020; Lakkaraju et al., 2019; Molnar, 2019); in our case, it consisted of a set of rules that approximately describes the stable scanpath of the gazes. We call this our interpretable mathematical model (IMM).

Considerations for the Construction of an Interpretable Mathematical Model

For automated strategy identification, an analytic model of this strategy is needed. The previous section explained why an interpretable mathematical model (IMM) seems to be a good candidate for this. In short, it is transparent about which characteristics of students’ gaze patterns are used, and how, to identifyFootnote 7 students’ strategies. To detect task-specific strategies and ensure that the IMM is usable—an evaluation criterion, see the Methodological Evaluation Criteria section—we use the idea of an attentional anchor (AA) as described earlier to search for a task-specific perceptual form of the gazes (stable scanpath; see the section Theoretical Interpretation of Students’ Gaze Patterns). The advantage of using an AA for constructing an IMM is that it is both task-specific (for each topic and task a different perceptual form is expected) and generalizable (AAs have already been found for topics and tasks in various mathematical domains).

Construction of an Interpretable Mathematical Model

In the second phase, we constructed an interpretable mathematical model (IMM) based on the attentional anchors (AAs) found in a qualitative analysis of students’ strategies in a previous study (Boels et al., 2022a). Two AAs were found: an imaginary horizontal line and an imaginary vertical line. We tried several ways of capturing these two strategies mathematically, with varying success. The best model we found relies on the following algorithm, which is based on saccade lengths and angles. The cut-off values below were found empirically by testing many values close to a slope of 1; the starting point of 1 for the slope followed from insights about the two AAs from the qualitative study. For each participant, we transform the sequence of saccades on the graph area into a sequence of -1, 0, and 1 values (Table 3):

A) If a saccade is less than 200 pixels long (Euclidean distance), map it to 0.

B) If a saccade is at least 200 pixels long and the absolute value of the slope of the saccade line is greater than or equal to 0.875 (or \(\tfrac{7}{8}\)), map it to 1 (these saccades are considered vertical).

C) If a saccade is at least 200 pixels long and the absolute value of the slope of the saccade is less than 0.875, map it to -1 (these saccades are considered horizontal).

Table 3 Example of applying the algorithm for the IMM

Our algorithm continues as follows:

D) Split the -1, 0, 1 valued sequence into subsequences of identical consecutive values.

E) Delete the duplicates in each subsequence and join the subsequences.

F) Remove the “0” cases; this is equivalent to disregarding saccades that are “short”.

G) Add up the elements of the resulting sequence.

H) If the total is negative or 0, replace the sequence with 0; if the total is positive, replace it with 1. This counts the number of runs of consecutive “long” horizontal or vertical scans and, based on the total, determines whether there were more horizontal or vertical sets of long scans. Replacing the sequence with a value of 0 indicates that the scanning is “mostly horizontal,” if we disregard short saccades and regard consecutive long scans with similar slopes as a single “scanning run”.
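The steps above can be sketched in code. This is a minimal illustration under stated assumptions, not the authors' implementation: saccades are assumed to be given as pairs of fixation coordinates in screen pixels, and the function names are ours.

```python
from itertools import groupby
from math import hypot

def classify_saccade(p, q, min_len=200.0, slope_cut=0.875):
    """Steps A-C: map one saccade (from point p to point q, in pixels)
    to 0 (short), 1 (long vertical), or -1 (long horizontal)."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    if hypot(dx, dy) < min_len:               # A: shorter than 200 pixels
        return 0
    if dx == 0 or abs(dy / dx) >= slope_cut:  # B: |slope| >= 7/8 -> vertical
        return 1
    return -1                                 # C: |slope| < 7/8 -> horizontal

def imm_strategy(saccades):
    """Steps D-H: return 1 for a 'mostly vertical' scanpath, 0 otherwise."""
    seq = [classify_saccade(p, q) for p, q in saccades]
    runs = [value for value, _ in groupby(seq)]     # D + E: collapse runs
    runs = [value for value in runs if value != 0]  # F: drop short saccades
    return 1 if sum(runs) > 0 else 0                # G + H: sign of the total
```

For example, a scanpath of one long vertical saccade, one short saccade, another long vertical saccade, and one long horizontal saccade reduces to the run sequence 1, 1, -1, giving a positive total and hence a "mostly vertical" classification.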

An illustration of what we do is given in the following graphs, in which the scans that are considered horizontal are in red and the vertical ones in blue (Fig. 6, colored version online). The green saccades are shorter than 200 pixels. The colors become lighter as the sequence progresses, to give a sense of the order in which the saccades occurred. In the left-hand graph, there are more blue lines than red ones, indicating a correct strategy for finding the mean in a histogram. In the right-hand graph, there are only red lines, so this scanning is regarded as horizontal. The 200-pixel saccade-length cut-off was fixed, thus not scaled to the width and height of the graph area (AOI). The size of the AOI in Fig. 6 is indicated with a black rectangle (not present in the item) and is the same for all histogram items. As the AAs are similar across all five items, we constructed one IMM for all items. This means that the IMM is a more general model than the random forest models, as the latter are different for each item.

Fig. 6
figure 6

Examples of horizontal and vertical eye movements that were counted in the IMM. Note. Red—long horizontal—saccades correspond to value -1 in step C of the IMM, and blue—long vertical—saccades correspond to value 1 in step B. Lighter colors indicate later occurrence. Green saccades are less than 200 pixels long (hence disregarded). Left is an example of a correct strategy for Item06 (more blue vertical gazes); right is an example of an incorrect strategy (more red horizontal gazes). Readers are referred to the online enlargement of this figure for (subtle differences in) the coloring

Supervised Machine Learning Algorithm

A machine learning algorithmFootnote 8 (MLA) automates building an analytical model and can provide a baseline to compare our IMM with. An MLA is a computer program that improves with experience (Kersting, 2018; Mitchell et al., 1990; Molnar, 2019). An MLA is not explicitly programmed to use any particular input features. ‘Features’ here refers to variables constructed from input data.

An MLA can be supervised or unsupervised. Being supervised means that the training cycle of the program is fed with, for example, the correctness of the strategy; in unsupervised learning, only the gaze data would be given to the program during the training cycle and the program might infer correctness information by itself. As we previously identified two groups in our qualitative study, we wanted to see if the MLA could identify those students correctly. That calls for a supervised MLA. During the training cycle (see Fig. 7), an MLA is fed with the raw gaze data as well as a classification code for the already identified strategy (0 for incorrect, 1 for correct). After this learning or training cycle, the trained MLA identifies the strategy of other students (or trials) that were not part of the training set.

Fig. 7
figure 7

Training and identification cycles of a supervised MLA

To provide a baseline for our IMM, we compared it with the random forest MLA. We used Mathematica’s implementation of random forest in the ‘Classify Function’ (version 13.2.1; WRI, 2020) with default parameters. The Classify Function automates and optimizes the data preparation process. For example, it automatically handles the different lengths of input vectors of xy-pairs (it normalizes input features).

In a previous version of the software, with which we started our analyses, the Classify Function initially chose random forest as the best MLA for each of the five items. In the newest version (13.2.1), another MLA (logistic regression) provided slightly better results for several of our items, and random forest for others. We report the results for random forest for several reasons, including consistency and the fact that the MLA is only a baseline for our IMM. For all further analyses, we seeded this random forest to make sure that our results are reproducible. An advantage of the Classify Function in the Mathematica software is that users do not need to deal with the details of machine learning methods. The Classify Function is used “as is” and automates many aspects of the methodological stack. Specifically, to give an example, it automates the selection of the machine learning method and the preparation of the data (consisting of only the—temporally ordered—x- and y-coordinates of fixations on the AOI graph area) so the selected method can be applied to it. Note that timestamps are not provided to the MLA, so the MLA does not ‘know’ that the data is temporally ordered or that data from other AOIs were removed. The downside of our approach is that we know little about how the Classify Function handles our data. For example, data preparation and feature selection are all hidden in the software, and we therefore consider it a black box, even though the MLA it initially selected—random forest—is itself known for its possibilities for feature extraction (e.g., through a feature importance plot). Our data, which are continuous, are by themselves difficult to interpret (x- and y-coordinates). Ultimately, the problem is that it is impossible to know whether the classification is based on gazes that are typical for the task-specific strategy at hand (cf. Kuhn & Johnson, 2013).
In addition, a first step toward (3), model inspection, was performed by using the random forest MLA trained on one item to classify strategies on another item, and vice versa, for all item pairs.

We underline that we used the MLA as a baseline for our IMM. We see it as a tool. Educational researchers may be more interested in how well such an MLA performed and what it can provide for them. MLA experts may be more interested in the details of the ML method. It is like users of an electric screwdriver being interested in how well it works and designers of such screwdrivers being interested in the details of how this screwdriver was assembled and could be optimized and whether better ones exist. We would like to emphasize that our article is not a report about research into machine learning methods. It is a report of a study of cognitive behavior which uses machine learning tools to analyze its data. Hence, the purpose of our research was not to conduct an in-depth investigation into which machine learning methods would work best for our data but to see how well our IMM performed compared to an MLA.

An important prerequisite for ML analysis is that the dataset is large enough and contains enough information for the MLA to identify (classify) students’ strategies. For this purpose, we decided to do a ‘sanity check’ and see if the MLA would be able to identify students’ answer correctness. As 50% of the students answered Item06 correctly (Table 2), an MLA with an accuracy of about 50% would be no better than tossing a coin. Fifty percent accuracy could also be reached by classifying all students as having a correct strategy. Therefore, for the prerequisite of enough information to be met, the accuracy of this answer identification needs to be well above 50% when using balanced data as in Item06. We decided to set the threshold at 70% or above, as this is a low-stake classification problem. Such a sanity check can be used as a first step before proceeding to the manual determination of students’ strategies through qualitative research, or when qualitative research is still in progress. To judge the performance of an MLA, other metrics also need to be considered; see the Methodological Evaluation Criteria section. Therefore, we first trained the MLA on students’ answers (and then identified other students’ answers) before we retrained the MLA on students’ strategies (and then identified other students’ strategies).
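This reasoning can be illustrated with a small sketch (the label counts other than Item06's 50/50 split are hypothetical): a classifier that always predicts the majority class already reaches an accuracy equal to the largest class proportion, so a useful identification must clearly exceed that base rate.

```python
def majority_accuracy(labels):
    """Accuracy of a classifier that always predicts the majority class."""
    return max(labels.count(0), labels.count(1)) / len(labels)

balanced = [0] * 25 + [1] * 25  # Item06-like 50/50 split of 50 students
skewed = [0] * 45 + [1] * 5     # hypothetical 90/10 split

majority_accuracy(balanced)     # 0.5: no better than tossing a coin
majority_accuracy(skewed)       # 0.9, despite carrying no information
```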

Random Forest Machine Learning Analysis

For the machine learning analyses, we used the random forest MLA implemented in the Mathematica software as described in the previous section, both for training the MLA—with a subset of students—and for the classification of the remaining students by the trained MLA. The aim of our ML analyses was to set a baseline for the IMM. We started our analysis with Item06 from the original 25 items in the qualitative study (e.g., Boels et al., 2022a, b). The first reason for choosing this item is that 50% of the students answered this item correctly (Table 2), which is—in theory—an ideal situation for an MLA. If it did not work with this item, we would not expect the MLA to work on other items. The second reason was that we expected students to have settled on a strategy by the sixth item of the original study—in line with our observations from the qualitative study—possibly making the identification of strategies easier than for the first two items. The third reason was that this graph is skewed to the left, making large horizontal eye movements—part of the incorrect strategy—more likely to be visible than in graphs that are skewed to the right (e.g., Fig. 8).

Fig. 8
figure 8

Example of all gaze data on Item02 (left) and Item06 (right) of student L27. Note. This student incorrectly answered six and eight, respectively. The incorrect strategy is visible in the many horizontal saccades going from the left-hand side of the graph to the middle or right-hand side and in the absence of vertical gazes going from the top or middle of the bars to the bottom of the graph (see Figs. 2 and 5 for comparison). The horizontal saccades in the gaze cloud on the graph area in Item02 (left) are smaller than those in Item06 (right). On both graphs, this student applied an incorrect strategy, even though for the first items (e.g., Item02) this student looked at the titles of the axes. When looking at statistical graphs (e.g., concentrations of greenhouse gases from 0 to 2005), experts tend to spend more time than novices on AOIs that help them understand the data in the graph (title, legend, axes) (Harsh et al., 2019). The gaze pattern of student L27 on Item02 indicates that attending to axes and graph titles might not be enough

In the present study, we used a supervised MLA. The MLA is fed with the raw data of the gazes: the x- and y-coordinates of the eyes for selected timestamps and the correctness of the answers (0 = incorrect, 1 = correct). As the stable scanpath (see Students’ Strategies section) occurred only in the graph area, we did not use other AOIs here. Note that we are interested in task-specific strategies, not in general reading or viewing strategies. Both the previous qualitative study and another study (Lyford & Boels, 2022) suggested that reading the axes was not a relevant part of such a strategy and would add noise when used in an MLA. Examples of other AOIs were the horizontal label (title of the horizontal axis), vertical label, horizontal axis, vertical axis, graph title, question, and ‘next’ button. We filtered and prepared the data as follows. First,Footnote 9 we removed all data that fell outside the computer screen (all rows with x- and y-coordinates outside the range 0–1366 horizontally and 0–768 vertically). Then, we calculated new y-coordinates as 768 minus the original values, as the coordinate system in the Tobii software is upside down compared to the Cartesian plane commonly used in mathematics and preferred in MathematicaFootnote 10 so that a y-coordinate of 700 in Tobii indicates a position close to the bottom of the screen. In Mathematica, we selected the data that Tobii indicated as being within the graph area (signposted by a 1 in the column ‘AOI graph’ in the dataset) and then selected the coordinates that were in the graph area (pixels 500 to 1100 horizontally and pixels originally between 190 and 525 vertically, for all items; Fig. 9). We also removed a few bad data points (the start and end lines of the gazes on an item, as well as incomplete data due to data loss, as described elsewhere). The order of the gaze data was kept in the input file but without timestamps.
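A sketch of this filtering pipeline, assuming rows of (x, y, in_graph_aoi) tuples in Tobii screen coordinates; the column layout and function name are illustrative, not the study's actual code:

```python
SCREEN_W, SCREEN_H = 1366, 768  # screen resolution in pixels

def prepare_gazes(rows):
    """Keep temporally ordered gaze points inside the graph-area AOI,
    with the y-axis flipped to the usual Cartesian orientation."""
    out = []
    for x, y, in_graph_aoi in rows:
        if not (0 <= x <= SCREEN_W and 0 <= y <= SCREEN_H):
            continue  # drop samples that fall outside the screen
        if not in_graph_aoi:
            continue  # keep only rows flagged 1 in the 'AOI graph' column
        if not (500 <= x <= 1100 and 190 <= y <= 525):
            continue  # keep only coordinates within the graph area
        out.append((x, SCREEN_H - y))  # flip: Tobii's y-axis points downward
    return out
```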

Fig. 9
figure 9

Example of the AOI graph area (yellow rectangle) in Item06. Note. The size and place of this AOI are the same for all five items

The tool we used (the Classify Function in Mathematica version 12.1) initially automatically chose random forest (considered a high-performance model, Kuhn & Johnson, 2013) as the best MLA for our continuous gaze data (WRI, 2020). During the review process, we redid all analyses with the newest version of Mathematica, 13.2.1. Instead of random forest, the software now suggested that logistic regression performs slightly better in most cases, and random forest in some. However, our analyses based on random forest still hold, as the conclusions that can be drawn from the analyses with logistic regression are the same. As the MLA is intended as a baseline for our IMM, we therefore report the results of random forest for all items in the remainder of this article. We seeded the random forest and prescribed it as the method in follow-up analyses (see the code line below). Although random forest is known for its explainability, the way it is embedded in the software made us consider it a black box. This ML model is not to be confused with the IMM, which we constructed and described in the section Construction of an Interpretable Mathematical Model. The following code line was used for obtaining a trained classifier function in follow-up analyses (some detailed code can be found in Appendix):

$$\mathrm{Classify}\left[\{\mathrm{datarow}_{1}\rightarrow\mathrm{class}_{1},\ \mathrm{datarow}_{2}\rightarrow\mathrm{class}_{2},\ \ldots,\ \mathrm{datarow}_{50}\rightarrow\mathrm{class}_{50}\},\ \mathrm{Method}\rightarrow\text{“RandomForest”}\right]$$

Here, each data row is a sequence of xy pairs—the coordinates of the gaze locations on the screen, in pixel units, keeping the original order of the fixations—and each class is either 0 (incorrect strategy) or 1 (correct strategy). Classify returns a classifying function. After this training cycle, a list of data rows is fed to this Classify Function to obtain the algorithm’s classification of the data. It returns a list of zeros and ones: the algorithm’s identification of students’ strategy correctness.
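The same training/identification cycle can be sketched in Python, with scikit-learn's random forest standing in for Mathematica's Classify; the fixed-length feature rows below are purely illustrative (the study fed variable-length sequences of xy pairs):

```python
from sklearn.ensemble import RandomForestClassifier

# Training cycle: data rows plus strategy codes (0 = incorrect, 1 = correct).
train_rows = [[0.1, 0.2], [0.9, 0.8], [0.2, 0.1], [0.8, 0.9]]
train_classes = [0, 1, 0, 1]
clf = RandomForestClassifier(random_state=0).fit(train_rows, train_classes)

# Identification cycle: the trained classifier returns a sequence of 0s and 1s
# for data rows that were not part of the training set.
predictions = clf.predict([[0.15, 0.15], [0.85, 0.85]])
```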

The ML analysis consisted of two steps: (1) verification of whether prerequisites were met (see the section Supervised Machine Learning Algorithm) and (2) identification of students’ strategies. At each step, the MLA was re-trained. We repeated these two steps for all five items. As the results of the second step were above our expectations, we decided to take an extra step: training the MLA with one item and testing the MLA with another; see the Results of Applying an MLA and IMM section. We did this for all item pairs. In addition, we performed several cross-validation procedures as described in the Methodological Evaluation Criteria section.

Methodological Evaluation Criteria

Validity, reliability, and causality are important methodological evaluation criteria in educational research. Different terminology and definitions are used in data science and human–computer interaction research for evaluating MLA results. In this section, we compare these terminologies.

First, in educational research, validity “concerns whether we really measure what we intend to measure” (Bakker & Van Eerde, 2015, p. 443). Threats to validity are internal and external, for example, maturation of subjects between measurements, subject selection effects on results, loss of subjects, changes in instrumentation, and so on (Eisenhart & Howe, 1992). This mostly applies to how the data were collected and what can be inferred from them. Regarding the data collection: as all data were collected in one session, maturation does not apply. We did not exclude any subjects from the dataset, hence there was no loss of subjects, and we did not change our instrumentation. What does apply is that we selected subjects from pre-university track students in Grades 10–12 only, from one school, and only those who volunteered (which is inevitable). However, we observed the same phenomena (strategies) in different subjects in previous studies with secondary school teachers (Boels et al., 2019b) and with university students (Boels et al., 2018). Moreover, qualitative research does not seek to generalize from sample to population but from variation in the data to the phenomenon (Levitt, 2021). Regarding what can be inferred from the data: we combined the gaze data with the results of cued recall. Cued recall means that students were shown their gazes (the cue) and asked to explain what strategy they used (cf. Van Gog et al., 2005). This supports the validity of the data collection. For the MLA and IMM, validity can be understood as whether these actually measure the phenomenon (strategies). As we discuss in the Research Approach section, this is true for the IMM by its design, but we cannot be sure about the MLA. However, the results of the IMM suggest that the phenomenon can be measured from the gaze data.

Second, reliability in educational research is about “independence of the researcher[s’ judgment]” (Bakker & Van Eerde, 2015, p. 443) or small variation in outcomes. The reliability of a method entails that its results can be reproduced with the same population and comparable items (e.g., Golafshani, 2003). Reliability, in this sense, can be understood as the MLA results having about the same, and sufficient, accuracy as the results of the qualitative strategy identification and of the IMM. Another way to look at the reliability of MLAs in this sense is by comparing results on different items (e.g., training the MLA with data from one item and identifying students’ strategies on another). When there is sufficient overlap, all are considered to identify the same phenomenon. This, in turn, makes the MLA results more reliable. Reliability here refers to what data scientists sometimes call the performance of the MLA.

In human–computer interaction research, reliability (also) involves the safety of the system, downtime, and consistency in the results (e.g., Bosnić & Kononenko, 2009; Webb et al., 2020), which is not relevant to us as we are not building an application.

Data scientists, however, define “reliability of classification as an estimated probability that the (single) classification is in fact the correct one” (Kukar & Kononenko, 2002, p. 219). To avoid confusion, we will therefore use performance when evaluating our results. Performance is also checked through cross-validation (e.g., Berrar, 2019), which involves applying the trained MLA to unseen data. We used several cross-validation procedures. First, we used a procedure often applied in statistical research—jackknife—a form of resampling (e.g., Efron & Stein, 1981) in which the answers or strategies of all 50 students are identified in an iterative process by the MLA based on learning from the other 49 students. Since each of the 50 students can be left out one at a time, there are 50 ways to do this, and the 50 results are averaged. Second, we performed a leave-one-out cross-validation (LOOCV), in which data from 49 students are used as training data and the strategy of the 50th student is classified. This is done 50 times until all students’ data have been used as test data once. Furthermore, we used a stratified fivefold cross-validation, in which the data are split into groups of ten students. Stratified means that in each group of ten students, the number of students with a correct strategy is roughly the same. The MLA is then trained with 40 students and classifies the strategies of the remaining 10. This is repeated five times until data from all groups have been used once as test data.
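The cross-validation procedures described above can be sketched as follows, with scikit-learn as a stand-in for Mathematica's Classify; the 50-by-20 feature matrix and random labels are placeholders, not the study's data:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneOut, StratifiedKFold, cross_val_score

rng = np.random.default_rng(0)       # seeded for reproducibility
X = rng.normal(size=(50, 20))        # 50 students, 20 placeholder gaze features
y = rng.integers(0, 2, size=50)      # 0 = incorrect, 1 = correct strategy
clf = RandomForestClassifier(random_state=0)

# Leave-one-out: train on 49 students, classify the 50th, repeat 50 times.
loo_acc = cross_val_score(clf, X, y, cv=LeaveOneOut()).mean()

# Stratified fivefold: five groups of 10 with roughly equal class balance;
# train on 40 students, classify the remaining 10, repeat five times.
cv5_acc = cross_val_score(clf, X, y, cv=StratifiedKFold(n_splits=5)).mean()
```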

An MLA’s performance can be measured in different ways: through accuracy, through a confusion matrix (e.g., comparing the results of the MLA with the results of the qualitative coding), and through a ROCFootnote 11 plot (Fawcett, 2006; see the Results of Applying an MLA and IMM section) that gives an idea of the true positive and false negative rates. Accuracy is expressed as a percentage of correctly predicted or identified cases (e.g., Afonja, 2017). As explained earlier, for our supervised MLA in this low-stake situation, we regard an accuracy of 70% or higher as good. In addition, we consider 80% or above as very good, and 90% or more as excellent. However, accuracy can be misleading. For example, if only ten percent of the students used the correct strategy and the MLA identified all strategies as incorrect, the accuracy would be 90%, but this identification would not be valid. Therefore, accuracy should be used with caution. In a confusion matrix, counts for true and false positives and negatives are reported separately (see the Results of Applying an MLA and IMM section and Appendix). From this matrix, sensitivity and specificity can be calculated using the following formulas (cf. Kuhn & Johnson, 2013), which give a better idea of how the MLA is performing:

$$\mathit{Sensitivity}=\frac{\#\,\text{samples qualitatively coded correct }\textit{and}\text{ identified by the MLA as correct (strategy)}}{\#\,\text{samples qualitatively coded as having a correct strategy}}$$

This formula is often shortened to (e.g., Fawcett, 2006):

$$\mathit{Sensitivity}=\frac{\text{true positives}}{\text{true positives}+\text{false negatives}}=\frac{\text{true positives}}{\text{total positives in true class}}=\frac{TP}{P}$$

In this second formula, it is not immediately clear what is considered to be the ‘true’ class, whereas in the first it is clear that we took the results of the qualitative study as the ‘true’ results and the MLA results as the hypothesized class. For example, for Item01 and the IMM, \(TP = 14\) and \(P = 14+10\) (Table 6); therefore, \(sensitivity = \frac{14}{24}=0.583\ldots\), which is rounded to 0.58 (Table 5, Results of Applying an MLA and IMM section).

$$\mathit{Specificity}=\frac{\#\,\text{samples qualitatively coded incorrect }\textit{and}\text{ identified by the MLA as incorrect (strategy)}}{\#\,\text{samples qualitatively coded as having an incorrect strategy}}$$

Similarly to the above formula for sensitivity, the formula for specificity is often shortened to:

$$\mathit{Specificity}=\frac{\text{true negatives}}{\text{true negatives}+\text{false positives}}=\frac{\text{true negatives}}{\text{total negatives in true class}}=\frac{TN}{N}$$
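Both shortened formulas are simple ratios over confusion-matrix counts. A minimal sketch, using the Item01/IMM sensitivity counts quoted above and hypothetical counts for specificity:

```python
def sensitivity(tp, fn):
    """True-positive rate: TP / (TP + FN)."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """True-negative rate: TN / (TN + FP)."""
    return tn / (tn + fp)

round(sensitivity(14, 10), 2)  # Item01, IMM: 14 / 24, rounded to 0.58
specificity(20, 5)             # hypothetical counts: 20 / 25 = 0.8
```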

Third, in educational research, measuring is only valid “if and only if (a) the attribute exists and (b) variations in the attribute causally produce variations in the outcomes of the measurement procedure” (Borsboom et al., 2004, p. 1061). Data science does not use the word validity. With the MLA, we intend to measure students’ strategy correctness based on their gazes. Therefore, variations in students’ gazes should produce variations in the classification by the MLA. Furthermore, if an IMM can be understood by human beings—and describes the observed phenomenon accurately—it adds to the validity of both the model and the MLA (e.g., Doshi-Velez & Kim, 2017; Rudin, 2019) as well as to its usability (e.g., Guidotti et al., 2018). Usability is another criterion from the human–computer interaction literature. In the section Construction of an Interpretable Mathematical Model, we described how an IMM that meets these criteria can be constructed.

Fourth, causality implies that a change in the results of the MLA is due to a change in the real system (Doshi-Velez & Kim, 2017). Although not every eye movement is part of the task-solving strategy (e.g., Schindler & Lilienthal, 2019), eye movements and strategies are associated (e.g., Kok & Jarodzka, 2017). Therefore, instead of causality, we use association, meaning that if students perform a specific pattern of gazes, they use a specific strategy. If there is an association between a gaze pattern and a strategy, the accuracy of identifying this strategy by the IMM and the MLA will be sufficient or better.

Results of Applying an MLA and IMM

The details of the results, including confusion matrices, can be found in Appendix (Tables 8, 9, 10, 11, 12, 13, 14 and 15).

Machine Learning Algorithm Results

Supervised Machine Learning with Students’ Answers’ Correctness

The identification accuracy of the MLA for students’ answers for the first item we looked at, Item06, turned out to be 88% (very good) with a jackknife cross-validation procedure. As we aimed to identify strategies, not students’ answers, we needed accuracy to be at least 70%. This criterion is met for all items when using version 13.2.1 (Table 4). Based on the confusion matrices (Appendix Tables 10 and 11) sensitivity and specificity were calculated (Appendix Table 8).

Table 4 Accuracies of the IMM and of the random forest MLA after cross-validation

Next, we trained the MLA with gaze data on one item and then identified answers on another item; see Appendix (Table 13) for the results. Accuracies varied from chance level (32%) to well above it (70%).

Supervised Machine Learning with Students’ Strategies’ Correctness

The accuracy of the random forest MLA for identifying students’ strategies in the first item we looked at, Item06, turned out to be 86% with a jackknife cross-validation procedure. This result is considered very good. Consistency of the MLA was tested through various procedures such as jackknife, leave-one-out cross-validation, and fivefold cross-validation (see section Methodological Evaluation Criteria). Overall, the MLA correctly identifies 71% to 88% of students’ strategies (jackknife cross-validation) for five different (but all histogram) items (Table 4). Percentages of correct strategies in the qualitative study (Table 1) varied between 38% and 48% (or 52%–62% when reversed), and the strategy identification results with jackknife are all well above these chance levels. When applying other cross-validation procedures, results drop: they vary from around chance level (38%) to good (78%) for leave-one-out cross-validation and from around chance level (56%) to good (74%) for fivefold cross-validation. We consider these results a baseline for the IMM.

Next, we trained the MLA with gaze data on one item and then identified strategies on all other items. Accuracies varied from around chance level (52%) to well above it (80%); see Appendix Table 14. Particularly interesting are the results when gazes on Item01 are used as training data (Fig. 10). Testing this trained random forest MLA for identifying strategies on the other items resulted in accuracies varying between good (72%) and very good (80%), which suggests that the MLA has the potential to generalize beyond a specific item, even when the shapes and skewness of the histograms differ (see also the ROC plot in Fig. 10).
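The train-on-one-item, test-on-the-others procedure can be sketched as follows; again an illustrative scikit-learn sketch with hypothetical per-item data, not the study’s actual pipeline.

```python
# Illustrative sketch of cross-item transfer (hypothetical data): train the
# random forest on gaze features from one item, then test it on each other item.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(1)
items = {name: (rng.normal(size=(50, 8)), rng.integers(0, 2, size=50))
         for name in ["Item01", "Item02", "Item06", "Item19", "Item20"]}

X_train, y_train = items["Item01"]  # train on Item01, as highlighted above
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

transfer = {name: accuracy_score(y, clf.predict(X))
            for name, (X, y) in items.items() if name != "Item01"}
print(transfer)
```

Repeating this loop with every item as the training item in turn yields the full transfer matrix reported in Appendix Table 14.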

Fig. 10
figure 10

ROC plot for strategies with train-test item pairs for random forest. Note. Ideally, for educational use, points should be concentrated in the upper left corner of the plot and close together for all items

Based on the confusion matrices (Appendix Tables 12 and 13), specificity and sensitivity (e.g., Kuhn & Johnson, 2013) were calculated (Table 5). After cross-validation, sensitivity (the identification of correct strategies) is low to acceptable, and lower than the low to excellent specificity (the identification of incorrect strategies). An excellent specificity is favorable for a future application that seeks to provide feedback to this particular group of learners. In practice, this could mean that only a few students who used an incorrect strategy would be missed (Q-incorrect, MLA-correct, type I error), which we consider most important for feedback. In addition, some students who used a correct strategy would receive feedback implying that they used an incorrect one (Q-correct, MLA-incorrect, type II error). As such feedback would hint at the correct strategy, it could make these students think once more and then conclude that their strategy was correct after all, which is not a problem.
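The sensitivity and specificity used here follow the usual definitions over a 2 × 2 confusion matrix; a minimal sketch with hypothetical counts (not the study’s actual data):

```python
# Sensitivity and specificity from a 2x2 confusion matrix (illustrative counts).
# Rows: qualitative coding (Q); columns: MLA identification.
#                 MLA-correct  MLA-incorrect
# Q-correct            tp            fn
# Q-incorrect          fp            tn
tp, fn = 20, 8   # hypothetical counts
fp, tn = 5, 17

sensitivity = tp / (tp + fn)  # proportion of correct strategies identified as correct
specificity = tn / (tn + fp)  # proportion of incorrect strategies identified as incorrect
print(f"sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")
```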

Table 5 Sensitivity and specificity of the MLA and the IMM when classifying strategies

The confusion matrices (see Appendix) provide further insight into how well the results of the MLA align with the results of the qualitative coding. Differences between the qualitative coding of the strategies and the MLA results can be due to what data scientists call ground truth noise in the data. Ground truth, here, is what the strategies actually “are” (in the real world, independent of coding). Educational researchers would explain this noise as inconsistencies, inaccuracies (e.g., due to merging two different but similar strategies into one code), or errors in the qualitative coding (as coders usually do not fully agree on the qualitative codes), or as noise in the gaze data (e.g., not every fixation or saccade on the graph area being part of the strategy). Such differences can also be used to reconsider the qualitative coding.

Results from the Interpretable Mathematical Model

With the IMM described earlier, we can correctly identify 62% to 84% of students’ strategies (Table 4). From the confusion matrices (Tables 6 and 7) that compare the results of the IMM with the results of the qualitative study (Q), sensitivity and specificity can be calculated (Table 5). Sensitivity varies between low and good; specificity varies between acceptable and excellent, see also Fig. 11.

Table 6 Confusion matrices for Item01, 02, and 06 (IMM)
Table 7 Confusion matrices for Item19 and 20 (IMM)
Fig. 11
figure 11

ROC plot for strategy identification (Fawcett, 2006) in which “the point (0, 1) represents perfect classification” (p. 862). Random forest is the MLA the IMM is compared with. Note. Ideally, for educational use, points should be concentrated in the upper left corner of the plot and close together for all items. One triangle is hidden behind the square in the lower left corner. The MLA provides a baseline for the IMM. Although the IMM worked well, the plot shows there is room for improvement
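Each point in such a ROC plot is the pair (1 − specificity, sensitivity). A minimal sketch with hypothetical values illustrates how closeness to the perfect-classification point (0, 1) can be quantified:

```python
# Sketch: each classifier/item pair becomes one ROC point
# (false positive rate, true positive rate) = (1 - specificity, sensitivity).
# The Euclidean distance to (0, 1) summarizes closeness to perfect classification.
import math

# hypothetical (sensitivity, specificity) pairs per item
results = {"Item01": (0.70, 0.85), "Item06": (0.80, 0.90)}

for item, (sens, spec) in results.items():
    fpr, tpr = 1 - spec, sens
    dist = math.hypot(fpr - 0.0, tpr - 1.0)
    print(f"{item}: ROC point ({fpr:.2f}, {tpr:.2f}), distance to (0, 1) = {dist:.2f}")
```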

Comparison of IMM and Random Forest MLA Results

The accuracy results of the IMM are quite close to those of the random forest MLA with jackknife cross-validation. In addition, the overlap between the IMM and the MLA in identifying students’ strategies before cross-validation was good, varying between 66 and 82% (Appendix Table 15). The results of the IMM are better than those of the MLA after cross-validation. Moreover, the results of both the IMM and the MLA indicate that strategies might be clearer in Item06, which is in line with what we found qualitatively. In the ROC plot (Fig. 11), the MLA results after cross-validation are compared with the IMM results. The plot shows that the IMM performs better. Altogether, we consider this a very good result, as it is not possible to know which parameters our MLA uses. As the strategy identification accuracy of both the MLA and the IMM is sufficient or better, the association between the two is also sufficient or better.
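The overlap and accuracy figures in this comparison are simple agreement proportions, which can be sketched as follows (the labels below are hypothetical, not the study’s data):

```python
# Sketch: overlap (agreement) between IMM and MLA strategy labels, and the
# accuracy of each against the qualitative coding Q. Labels are hypothetical.
imm = ["correct", "incorrect", "correct", "correct", "incorrect"]
mla = ["correct", "incorrect", "incorrect", "correct", "incorrect"]
q   = ["correct", "incorrect", "correct", "incorrect", "incorrect"]

overlap = sum(a == b for a, b in zip(imm, mla)) / len(imm)  # IMM-MLA agreement
acc_imm = sum(a == b for a, b in zip(imm, q)) / len(q)      # IMM vs. coding
acc_mla = sum(a == b for a, b in zip(mla, q)) / len(q)      # MLA vs. coding
print(overlap, acc_imm, acc_mla)  # -> 0.8 0.8 0.6
```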

The IMM is a more general model than the random forest model, as the latter differs for each item. Given the MLA as a baseline for the IMM, we consider the current IMM likely to be a main component of a more general and more precise model explaining gaze behavior during the kind of cognitive tasks considered in this article. The fact that the MLA, when trained on one item, relatively successfully predicts performance on a different item can be considered evidence that the gaze data contain information about cognitive behavior on this task, even though this behavior is not explained by the MLA.

As this article aims to provide proof of principle, namely that the perceptual form of the stable scanpath can be captured by a model, we did not further optimize the IMM. We chose the IMM that had the best overall performance across all items among the models we tried. When aiming to use the model in an application, one way to refine the IMM could be to split it into two models: one only for identifying incorrect strategies (with two possible outcomes: the strategy is incorrectFootnote 12 or unknown) and one only for correct strategies. Combined, both models would yield four options for strategy identification: correct strategy, incorrect strategy, unknown, or contradicting outcomes. Unknown means that the strategy could not be identified; contradicting outcomes would require an extra rule for deciding which strategy it is. Another idea for optimizing these two models, besides adjusting the slope that distinguishes between horizontal and vertical gazes to obtain better results for specific items, is to adjust the saccade length in the model. In the current model, only saccades of at least 200 pixels are considered for both horizontal and vertical gazes. The saccade length for vertical gazes could be scaled (shortened) to the size of the AOI. The relatively small height of the graph area (335 pixels) as opposed to its width (600 pixels) would justify such an adjustment. A further improvement of the IMM could be to incorporate the alignment of saccades (relevant for the most common strategies).
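Our reading of the saccade rule described above can be sketched as follows. The 200-pixel minimum length is stated in the text; the slope threshold of 1.0 (45 degrees) is a hypothetical placeholder, not the study’s actual parameter.

```python
# Sketch of the kind of rule the IMM applies (a reading of the description above):
# a saccade counts only if it spans at least 200 pixels, and its slope decides
# whether it belongs to a horizontal or a vertical gaze pattern.
import math

MIN_LENGTH = 200       # pixels, as stated in the text
SLOPE_THRESHOLD = 1.0  # hypothetical value, i.e., a 45-degree boundary

def classify_saccade(x1, y1, x2, y2):
    """Label one saccade as 'horizontal', 'vertical', or 'ignored' (too short)."""
    dx, dy = x2 - x1, y2 - y1
    if math.hypot(dx, dy) < MIN_LENGTH:
        return "ignored"  # too short to count toward a gaze pattern
    if dx == 0 or abs(dy / dx) > SLOPE_THRESHOLD:
        return "vertical"
    return "horizontal"

print(classify_saccade(100, 300, 450, 320))  # long, nearly flat -> horizontal
```

Adjusting `SLOPE_THRESHOLD` or shortening `MIN_LENGTH` for vertical saccades, as suggested above, would be one-line changes in such a rule.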

In addition, the performance of both IMM and machine learning might be influenced by students switching their strategy during or in between trials, making strategy identification less clear. There is some evidence in the interview data (e.g., Boels et al., 2022a, b and below) that students’ strategies were influenced by Item13 to Item18 (with dotplots) which were designed to scaffold students in applying a correct strategy (e.g., Lyford, 2017). Although this did not result in a higher number of correct strategies, it might have changed the strategies on a more subtle level.

L22: Yes then [Item19] I was going to change my approach a little bit. Then I started doing it a little bit similar to what I did with the dots. So then here is about more and there is again less, so then it will be somewhere here in between.

Conclusions and Discussion

Automated identification of students’ strategies is a prerequisite for targeted intelligent feedback. The present study took on this challenge by providing an example of automated identification of students’ task-specific strategies on single histograms based on gaze data. This could allow for future applications such as real-time feedback based on gaze data, for example, collected through webcams (e.g., Knoop-Van Campen et al., 2021).

We analyzed a set of gaze data in three phases: (1) an analysis of raw gaze data on one AOI (that contained the relevant scanpath) through an MLA that provided a baseline for the second step; (2) the construction of a separate interpretable mathematical model (IMM) using the same gaze data and insights on the perceptual form of this stable scanpath from previous qualitative research; and (3) an evaluation of the results by comparing the performance of the MLA, the IMM and the overlap between the two. The IMM outperformed the MLA in several cases.

The MLA (phase 1, random forest as implemented in the software Mathematica Classify Function) has the advantage that it can process raw gaze data (x- and y-coordinates). It has the disadvantage that it is a black box in that it does not explain how it reached its decision for an individual student. The results of the random forest MLA after cross-validation provided our baseline for the IMM (phase 2). The IMM performs well (62% to 84% accuracy) with the advantage of being transparent for individual decisions and theoretically meaningful. The overlap between the results of the IMM and MLA is sufficient (phase 3).

Although these results are encouraging, an issue in data science is whether an MLA trained on one item is able to identify students’ strategies for similar but different items. We, therefore, trained the MLA on one item, tested it on all other items, and repeated this until every item had been used once as a training item. The accuracy results varied from around chance level to well above it; the latter indicates very good performance. Combined, these results indicate that the IMM and MLA do describe the same phenomena—strategies—and that these strategies can be derived from students’ gaze data.

What is new in our approach is that the filtering and preparation of the gaze data for the IMM and MLA are based on one AOI that contains the perceptual form of the gazes, instead of using, for example, the number of transitions between AOIs. This perceptual form is a stable scanpath indicating the student’s focus of interaction and is interpreted as an attentional anchor (AA, e.g., Abrahamson & Sánchez-García, 2016). In our previous research (Boels et al., 2022a) we found that this perceptual form is indicative of students’ strategies.

A prerequisite for the approach used is that a stable scanpath has been found in the gaze data. Furthermore, it requires gaze data to be classified into groups of students using the same strategy. Both prerequisites were met for our study. A further prerequisite for ML analysis is that the dataset is large enough and contains enough information for the MLA to identify (classify) students’ strategies.

A limitation of our study is that we had gaze data on only fifty students. We alleviated this limitation by showing that the approach worked for five items with differently shaped histograms, by using a resampling approach (jackknife cross-validation), and by training the MLA with gaze data from one item and then having it identify strategies for all other items. For future research, collecting data from a larger and different population is recommended. Another limitation of our study is that we used one item type (single graphs) and five variants of one graphical representation (histograms). It would be interesting to apply our approach to other domains, following the three phases described above for each new topic. Once the IMM is optimized and the MLA is trained, both an IMM and an MLA could be implemented. The MLA might be more accurate (see results of the jackknife cross-validation) but is computationally more complex. The IMM is simpler, faster, provides insight into the relevant part of the gaze pattern that its decision was based on, and can deal with partial data. This allows feedback on strategies before an answer is given. The agreement between the IMM and the MLA can be used as an indication of how certain the strategy identification was.

From a methodological perspective, a first contribution of this study is that our approach is both item-specific and generalizable. It is important that both IMM and MLA are item-specific, as the mathematical strategies of students are specific to an item type. Our IMM and MLA meet this requirement since the stable scanpath is specific to a given item; a stable scanpath refers to the perceptual form of this scanpath (e.g., a horizontal line, triangle, or point), not the sequence of AOIs. We expect that a stable scanpath can be found in gaze data on items in various domains (e.g., Strohmaier et al., 2020). It is also important that this method is generalizable, and, therefore, suitable for other mathematical domains. First, IMMs and MLAs (e.g., Rudin, 2019)—such as our set of rules for the IMM—are general methods. Moreover, our approach is also theoretically generalizable in the sense that educational researchers aim for “how and why the studied events occurred (or not)” (Yin, 2013, p. 326). The studied events are students’ strategies and are observed as stable scanpaths that indicate students’ focus of interaction with the item.

It could be argued that our approach is not very generalizable, as x- and y-coordinates are sensitive to, for example, scaling, the position of the graph on the screen, and the shape of the histogram. However, the same could be argued for AOIs, as AOIs are x- and y-coordinates binned into categories by researchers and are, therefore, also item-specific. Furthermore, coordinates can be rescaled, which adds to their generalizability. In addition, we showed that for five differently shaped histograms, the IMM performed above chance level to good and the MLA performed at chance level to very good after cross-validation. Moreover, the same IMM was used for all five items, which makes it a more generalizable model than the MLA, which was initially retrained for each item. Also adding to this generalizability is that, for several items, we have successfully trained the random forest MLA on that one item and then tested it on all other items.
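The rescaling argument can be made concrete: mapping raw gaze coordinates into the graph-area AOI makes the same rules applicable across items with differently placed graphs. The AOI offsets below are hypothetical; the 600 × 335 pixel size follows the dimensions mentioned earlier.

```python
# Sketch: rescale raw screen coordinates into the graph-area AOI so that gaze
# features become independent of where the graph sits on the screen.
# aoi_left/aoi_top are hypothetical; 600 x 335 px follows the stated dimensions.
def rescale(x, y, aoi_left, aoi_top, aoi_width=600, aoi_height=335):
    """Map screen coordinates into [0, 1] x [0, 1] relative to the AOI."""
    return (x - aoi_left) / aoi_width, (y - aoi_top) / aoi_height

print(rescale(700, 367.5, aoi_left=400, aoi_top=200))  # -> (0.5, 0.5)
```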

A second methodological contribution is that all gaze data on one AOI are used, in contrast to methods that use aggregated data such as the transition from one AOI to another (e.g., Garcia Moreno-Esteva et al., 2020). Unlike heatmaps that are produced afterward (e.g., Schindler et al., 2021), raw gaze data offer the possibility of real-time feedback.

Third, we show that it is possible to automatically identify students’ task-specific strategies from their gaze patterns. As soon as a stable scanpath is found, we think this can be captured in an IMM as well as through an MLA, and hence be automated. Scanpaths are found for tasks in various mathematical domains: numbers (Schindler et al., 2020), arithmetic (Green et al., 2007), proportional reasoning (Shayan et al., 2017), area and perimeter (Shvarts, 2017), Cartesian coordinates (Chumachemko et al., 2014), geometry (Schindler & Lilienthal, 2019), trigonometry (Alberto et al., 2019), parabola (Shvarts & Abrahamson, 2019), statistical graphs (Boels et al., 2022a, b), and more (Lilienthal & Schindler, 2019; Strohmaier et al., 2020). Therefore, we believe that a similar approach can be used for other topics.

Fourth, using raw gaze data opens the possibility of implementation in an online (e.g., Cavalcanti et al., 2021) and adaptive tutoring system (e.g., Scheiter et al., 2019) with real-time feedback. The IMM can process raw data directly, without the (black-box) preprocessing that the MLA performs, and its results are straightforward to interpret.

Fifth, studying the differences between the results of the IMM and MLA on the one hand, and the qualitative coding on the other hand has the potential to improve the qualitative coding. Whenever the three methods lead to different outcomes, a closer inspection of the gaze patterns on the item, combined with interview data (if available), may lead to new insights for qualitative coding. This would combine the best capacities of people and machines, as suggested by Van de Schoot (2020). Sixth and finally, our approach offers a new road for replicating results from a qualitative study.

From a theoretical perspective, this study shows that an AA can be used as a theoretical lens to search for a stable scanpath that reflects a mathematical strategy that is meaningful to the students. These stable scanpaths can be linked to the idea of an AA as follows. In a retrospective recall, students talked about an imagined action. This imagined action is—according to students—coordinated by an imaginary mathematical object: a horizontal or vertical line. As an AA is an existing or imagined object or area that emerges to facilitate or coordinate sensorimotor actions (Abrahamson & Sánchez-García, 2016), we also interpret these lines as an AA.

In addition, the manipulation of an imaginary object, as manifested in the gaze data, could suggest links between theories of mental processes and embodied cognition. The AA was previously found when students interacted with the environment. In our items, students cannot physically manipulate the graph. Nevertheless, the gaze data show a stable scanpath indicating scanning of this imaginary object at a concrete location on the graph area of the screen (Boels et al., 2022a, b). We believe this reveals that cognitive processes can also be embodied and that eye movements can be a manifestation of both perception and action.

This article may fuel the dialogue between educational researchers and data science experts. An advantage of our IMM is its interpretability. MLAs may be experienced as a black box, and educational researchers may focus on how well it performs rather than how it performs. As educational researchers, we wondered what the application of data science tools to our data would bring us. It is important to promote the dialogue between educational researchers and MLA experts to keep boundaries between disciplines permeable. At such boundaries, exciting new research can emerge.

Future research into finding stable scanpaths for applying this method might, for example, concern geometry, such as the Pythagorean theorem or the cosine rule. In calculus, one might consider interpreting the slope or direction field when learning to solve differential equations. To allow task-specific gaze patterns to emerge, an alternative way of introducing a topic could be considered (e.g., Janßen et al., 2020). Finally, the domain of graphical or diagrammatic literacy could be a future line of research. Several examples can be found in the literature of students having difficulties with graphs in mathematics (e.g., difficulties with complex line graphs, Carpenter & Shah, 1998; overgeneralization of linearity, Leinhardt et al., 1990; misreading of graphs, Roth & Bowen, 2001; inadequate strategies, Tai et al., 2006), but also in science education (e.g., Kragten et al., 2015). All these domains have in common that spatial patterns may play a role.

Another direction for future research could be to improve the IMM and investigate the apparent trade-off between its sensitivity and specificity. We know that students used several strategies, but it is unclear whether and how this is visible in this trade-off. In addition, a possible improvement of the IMM could be to tailor it to each item. Furthermore, the alignment of saccades could be included in the IMM (important for the most common strategies, see Boels et al., 2022a).

Future research might also focus on the appearance of and changes in students’ strategies over time. By using an IMM and an MLA, online automated feedback on students’ strategies becomes possible, in some cases maybe even before students give their answers. This might make online feedback in massive online courses, online teaching, and homework more accurate and efficient. Another possibility would be to provide teachers with a dashboard of students’ strategies (e.g., Knoop-Van Campen et al., 2021). The agreement between an MLA and an IMM could then be used as a measure of how reliable the strategy identification is. A prerequisite is the availability of cheap equipment for measuring eye movements. We expect that more precise measurement of eye movements will become available on consumer computers in the near future, for example, through webcams. Whether this will be implemented in software and used by consumers will also depend on ethical discussions about privacy, fairness, bias, et cetera.