Self-regulated learning (SRL) skills are considered essential in contemporary society as they provide a foundation for successful lifelong learning (Klug et al., 2011). SRL is often described as a dynamic process in which learners actively set their learning goals, and select, monitor and control their learning strategies, cognitive resources, motivation, and behaviour to optimise their learning process and achieve desired outcomes (Winne & Hadwin, 1998; Winne & Perry, 2000; Zimmerman, 1990). Self-regulated learners thus engage different cognitive processes (e.g., reading, re-reading and elaborating) to accomplish a learning task, and also different metacognitive processes (e.g., orientation, planning and monitoring) to plan and oversee their learning (Winne & Hadwin, 1998; Winne, 2013; Winne et al., 2010). Researchers have documented that self-regulated learners often outperform their peers who do not sufficiently and productively engage in SRL (e.g., Bannert & Reimann, 2012; Azevedo et al., 2008). To improve understanding of SRL processes and provide adequate support to learners who need to boost their SRL skills, researchers have attempted to measure SRL using different methods. Traditional methods relied on self-reports (e.g., surveys) to collect data about SRL (Pintrich et al., 1991) and on manual analytical work (e.g., coding of screen recordings) to analyse those data (Zhang & Quintana, 2012). These methods, however, often could not provide a complete picture of the self-regulation process (Järvelä & Bannert, 2019) and failed to capture many of the SRL processes that have been theorised to occur in a learning session (Winne & Hadwin, 1998; Winne, 2004).

In recent years, SRL researchers have proposed that self-regulation should be studied from the perspective of the events that learners generate rather than from the perspective of learning experiences that learners report themselves (Bannert et al., 2014). The event-based analytical approach is hence focused on identifying occurrences of the specific SRL processes that learners engage in, e.g., planning, monitoring or evaluation (Siadaty et al., 2016a), in an unobtrusive way, e.g., by collecting and analysing authentic trace data recorded in digital learning environments (Winne et al., 2010). The most common type of trace data used in previous studies was timestamped events representing learners’ navigation across different web pages in a learning environment (Kinnebrew et al., 2013). In the current study, we refer to these as navigational logs. However, navigational logs are often not fine-grained enough to reliably study SRL processes (Järvelä & Bannert, 2019). For instance, a navigational log showing that a learner opened a page with a textbook chapter and stayed on that page for some time does not reveal what operations the learner performed on the chapter’s text (e.g., re-reading a sentence, highlighting a key phrase). The lack of such fine-grained information, in turn, is a major obstacle for researchers who aim to reliably infer nuanced SRL processes, e.g., elaborating, monitoring or evaluating (Bannert, 2007). The problem with granularity and reliability of SRL measurement could be addressed by introducing new data channels (Winne, 2010). For instance, by knowing the positions of a learner’s eye fixations throughout the learning session, researchers may infer which parts of the learning content the learner was operating on (e.g., introductory paragraph, table of contents and a list of learning goals) or what tools the learner used for that purpose (e.g., a timer and note taking tool).

Given recent developments in technologies for capturing user data across multiple channels, researchers can collect and analyse learner trace data that are richer and more informative than navigational logs. These new data channels include clickstream and keystroke data; eye-tracking data; brain activation, skin conductance, and other bio-physiological signals (Reimann et al., 2014; Järvelä & Bannert, 2019; Azevedo & Gašević, 2019). Data collected from multiple channels can be simultaneously indicative of different cognitive, metacognitive and affective processes. As Reimann et al. (2014) argued, multi-channel data sets can provide researchers with resources for exploring learning processes that cross the ontological boundaries between the human body, the environment, and the mind. In these data sets, the learning process can be modelled as streams of “events”, which aligns more closely with the complex and dynamic nature of the learning process than observing learning as a single data stream.

Despite the documented importance of SRL skills for independent lifelong learning and the promise of multi-channel data analysis for deepening the understanding of SRL processes, there is still limited research in this area (but see Bernacki et al. (2012), Hörmann and Bannert (2016), Lali et al. (2014), Trevors et al. (2014), Saint et al. (2021), and Fan et al. (2020)). Adding to this growing body of research, we conducted a study to examine the extent to which the measurement of SRL can be enhanced by enriching navigational log data with the data collected using peripheral computer devices (i.e., clickstream data and keystrokes) and eye-tracking devices (i.e., eye-gaze positions). To that end, we developed a trace parser (a set of computational rules) for converting raw multi-channel data into SRL processes. The trace parser includes two main components, the action library and the process library. With it, we detected and examined eight groups of SRL processes, four metacognitive and four cognitive: Orientation, Planning, Monitoring, Evaluation, First-reading, Re-reading, Elaborating and Organising. We then utilised the process mining analytical approach to investigate whether the addition of peripheral and eye-tracking data channels to navigational logs can provide new information about theorised SRL processes, not only in terms of their number, but also in terms of their duration and temporal occurrence. Our findings suggest that, upon adding new data channels to the analysis, we were able to detect cognitive and metacognitive processes that are central to SRL and that could not be detected throughout the learning session with navigational log data alone.


We first define the key concepts used throughout the paper, following the definitions and operationalisations proposed in the previous literature (Winne, 2014; Siadaty et al., 2016a; 2016b). The key concepts in this article include learning actions, SRL processes, and SRL process maps (Fig. 1). Here, we also provide a running example together with Fig. 1 to illustrate how a learner’s learning process is analysed in this paper. First, learning actions are determined based on the occurrences of learning events recorded in raw trace data, e.g., a learner’s click to create or edit a note during a learning session is indicative of a NOTE_EDITING action. Then, the sequences of learning actions are mapped to SRL processes, such as orientation, planning or monitoring (Siadaty et al., 2016a; 2016b; Saint et al., 2020a). For example, imagine a learner, at the outset of a learning task, reading the page with the learning goals, then creating a note in a note-taking tool, and then continuing to read the remaining learning goals. For this learner, we will obtain the following sequence of actions: from LEARNING_GOAL to NOTE_EDITING back to LEARNING_GOAL. This sequence will further be mapped to the SRL processes of Orientation or Planning based on theoretical propositions in SRL models (Winne and Hadwin, 1998; Zimmerman, 2000). The SRL process maps are then created based on temporal co-occurrence between SRL processes. Two processes are considered to temporally co-occur, and to transition between each other, if they are both detected within a predefined time window. For example, a learner’s overall learning process could be represented as an SRL process map as given in Fig. 1 (right), showing that this learner started with the Orientation and Planning processes and then engaged in other processes such as Elaboration and Monitoring. We used process mining analytical methods to create and analyse such SRL process maps.

Fig. 1 Relevant terminologies and their relations in this study

Detecting learning actions and SRL processes

Over the past decade, researchers have begun utilising additional data channels that provide more information than simple navigational logs to examine complex SRL processes (Bernacki et al., 2012; Bernacki et al., 2013; Hörmann & Bannert, 2016; Lali et al., 2014; Trevors et al., 2016; Saint et al., 2021; Fan et al., 2020). For instance, peripheral data such as mouse clicks and movements, keystrokes, and window scrolling have been proposed as an alternative means of unobtrusively collecting SRL data (Hörmann & Bannert, 2016). Hörmann & Bannert (2016) revealed that pauses in interaction between a learner and the learning environment (i.e., periods without mouse or keyboard input) are associated with increased cognitive load of the learner. The authors also demonstrated that typing behaviour modelled from the peripheral data can predict performance and motivation. Even though recording peripheral data is usually straightforward, interpreting and labelling these data to reflect SRL processes can be very challenging (Hörmann & Bannert, 2016; Lali et al., 2014). In this article, we report on a novel approach we developed to automatically analyse and label these peripheral data to reflect SRL processes and enhance navigational logs.

Eye-tracking is another data channel that has attracted increasing attention from SRL researchers (Taub et al., 2016; Trevors et al., 2016). This type of data is collected by using eye-tracking devices (e.g., Tobii, EyeTech, SmartEye) to (a) detect the point of gaze (i.e., the point at which a user is looking) or the eye motion and (b) capture gaze information in terms of fixations and saccades. There is evidence that eye-tracking data can provide useful insights into different dimensions of cognition, metacognition, and affective states (Bondareva et al., 2013). In SRL studies, eye-tracking is often used to analyse the points of learners’ attention. For example, with the help of eye-tracking data researchers have revealed what strategies learners use to coordinate information from multiple sources (Trevors et al., 2016). The strategy was operationalized as a transition between two areas that were fixated upon. Researchers have also harnessed eye-tracking data to detect specific gaze patterns and fixation behaviours on pre-defined areas of interest (AOI) and predict metacognitive monitoring and self-regulated learning (Taub et al., 2016). For example, Taub et al. (2016) revealed that students with high prior content knowledge had a significantly higher frequency of fixations on the content-notes pair than the low prior knowledge group did. This indicated that high prior knowledge students already had knowledge of the content and therefore spent more time fixating on the notes because they were taking, reviewing or re-organising notes. Such detailed differences in terms of learning events and SRL processes could not be captured by raw navigational logs alone, and therefore eye-tracking data can provide useful insights. However, using eye-tracking data to systematically map the sequences of actions to theorised SRL processes is still under-investigated. As an attempt to address this gap, we included eye-tracking data in our study and linked them to SRL processes using the process mapping approach.

We hence utilised three data channels in the present study (Fig. 1): (i) navigational log data (logs capturing navigation between pages and time spent on the pages), (ii) peripheral data (mouse movements, scrolls, clicks, and keyboard strokes), and (iii) eye-tracking data (gaze points and fixations on the screen, timestamped in milliseconds). Despite emerging research that relies upon some or all of the three data channels to study learning actions and detect SRL processes (Siadaty et al., 2016c; Saint et al., 2020a, 2020b; Matcha et al., 2019) (Fig. 1, steps 2 and 3), more research is needed to determine how a combination of data channels can improve measurement and provide a more complete understanding of SRL. With this gap in mind, we defined our first research question:

RQ1: When using a combination of data channels, how do the overall measurement results differ from using navigational log only, in terms of the distributions of identified SRL processes?

The answer to RQ1 can only address how many SRL processes are recognised upon adding peripheral and eye-tracking data to navigational log data. It does not indicate whether the enrichment of navigational log data with the other two channels changes the identification of processes at certain points during a study session, that is, whether an SRL process recognised at some point in time and with a certain duration from navigational log data would also be recognised at the same point in time and with the same duration when the other two data channels are added. While research on temporal and sequential dimensions of SRL has gained significant traction recently (Molenaar & Järvelä, 2014; Azevedo, 2014; Saint et al., 2020a), there has been little research that examines how different combinations of data channels can help identify SRL processes during a learning session. We aimed to address this gap guided by our second research question:

RQ2: To what extent are the identified SRL processes affected or refined by the addition of peripheral and eye-tracking data, and which processes are mainly affected?

To answer this research question, we developed a new metric, the “percent of processes refined”, which quantifies how much the SRL processes identified differ across combinations of data channels.

Analysis of SRL process maps

Recent approaches to data analysis aim to identify SRL process maps (Fig. 1, step 4) by using techniques such as sequence mining (Jovanović et al., 2017; Martinez et al., 2011), process mining (Saint et al., 2018; van den Beemt et al., 2018), and epistemic network analysis (Ahmad Uzir et al., 2019; Matcha et al., 2019). With the use of these techniques, researchers aim to analyse how SRL processes are temporally sequenced (Molenaar & Järvelä, 2014), what the probabilities of transitions between SRL processes are (Saint et al., 2020b), how temporal sequencing and probabilities of transitions of SRL processes compare across different student groups (Saint et al., 2020b; Ahmad Uzir et al., 2020), and whether and to what extent temporal connections between SRL processes occur as theoretically hypothesised in the models of SRL (Bannert et al., 2014). For example, Bannert et al. (2014) applied a process mining technique (fuzzy miner algorithm) to analyse coded think aloud data (e.g., read, elaborate, plan and search) to generate SRL process maps and find how those maps differ between high and low performing learners. Matcha et al. (2019) compared process, sequence and network analytic approaches to demonstrate how and to what extent these different approaches influence the detection of learning strategies from trace data. However, most approaches to the analysis of SRL processes have been based on a single data channel, and mostly relied upon navigational logs or think aloud data. Measurement and analysis of SRL process maps using new and multi-channel data is still rare in the literature and has only recently received researchers’ attention (Taub et al., 2016; Azevedo & Gašević, 2019; Reimann, 2019). In particular, there is limited research that examines whether and how the temporal sequencing of SRL processes changes after the peripheral and eye-tracking data channels are added to navigational log data. To address this gap, we formulated our third research question:

RQ3: Whether and how does the temporal sequencing of the recognised SRL processes change with the introduction of additional data channels to navigational log data?

By answering these three research questions proposed in this paper, we aim to provide novel contributions to the field of SRL, mainly from a methodological point of view. This study proposes an analytic approach, which aims at extracting SRL processes from multi-channel data. Following the proposed approach, this study sheds new light on how, why and to what extent the measurement of SRL can benefit from the use of multi-channel data.



The study was conducted in a lab setting at a German university. A total of 36 learners (with an average age of 26.20 years, and standard deviation of 4.21) who used German as their first language were recruited as participants. The participants came from very diverse majors (more than 20 different study or degree programs), such as chemistry, computer science, and social sciences. Due to an initial equipment failure and a low eye-tracking capture rate, eye-tracking data could not be collected for 10 participants. In addition, one participant encountered a technical failure in the middle of the study, and his session was interrupted and restarted. Because this may have affected his learning approach and SRL processes, we did not include his data in this study. Therefore, to compare the measurement of SRL across different data channels, we included the remaining 25 participants in this study.


This study used a pre-post design with a 45-minute learning session during which participants were asked to study three topics: 1) artificial intelligence, 2) differentiation in the classroom and 3) scaffolding of learning. The learning task was to integrate the three topics into an essay that described learning in school in 2035. A detailed task instruction and four learning goals were provided, along with a detailed rubric that would be used to score the essay. Within 45 minutes, the participants were asked to selectively read and learn from more than 30 web pages containing around 6000 words, and to write an essay of 300-400 words. As such, the task was intentionally challenging to stimulate the participants to use SRL skills and the tools provided in the learning environment (such as the note taking tool and timer) to complete the task.

Before the study began, the experimenters introduced the study requirements, guided the participants through the pre-test on site, familiarised them with the learning environment, and performed the eye-tracking calibration. After the participants finished the whole session, the experimenters asked them to complete the post-test and transfer-test, to measure their learning outcomes.

Learning environment

We built a learning environment (see Fig. 2) with five areas of interest (AOI). The AOI zones included the catalogue zone on the left, the reading and writing zones in the middle, and the note taking and timer zones on the right. To ensure the reliability of eye-tracking analysis, learners were not allowed to adjust the size of different zones or close any section. The learners used a personal computer that also collected the data from the following devices: screen-based eye-trackers (Tobii Pro Spectrum TX300), webcams with microphones, keyboard, and mouse. Data were collected on the computer using the iMotions software system, which synchronised the multi-channel data on a unified timeline.

Fig. 2 Learning environment (AOI) and iMotions system (synchronising multi-channel data)

Data channels

Three data channels were included in this study: 1) Navigational log data, which stored simple navigational log data and time spent on pages; 2) Peripheral data, which stored data about mouse movements, mouse clicks on pages, mouse scrolls, and keyboard strokes; and 3) Eye-tracking data, which was sampled at 300 Hz and consisted of fixations, saccades, gaze points, and pupil size. Since our research questions focused on the improvement of analysis of SRL by adding new data channels (instead of by using individual data channels separately), we gradually combined the data channels: (i) nav_only – navigational log data only, (ii) enhanced_log – included navigational log data and peripheral data, and (iii) log+eye-tracking – included the navigational log data, peripheral data, and eye-tracking data.
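The gradual combination of channels can be illustrated with a minimal Python sketch that merges per-channel event streams on a shared timeline. The stream names, the `(timestamp_ms, event)` tuple layout and the `combine` helper are illustrative assumptions only; they do not reproduce the iMotions export format or the parser used in this study.

```python
from heapq import merge

# The three gradual combinations named in the text, expressed as the
# channels each one draws on (channel keys are hypothetical).
CHANNEL_SETS = {
    "nav_only": ("navigation",),
    "enhanced_log": ("navigation", "peripheral"),
    "log+eye-tracking": ("navigation", "peripheral", "eye"),
}


def combine(streams, combination):
    """Merge the selected channels into one time-ordered event list.

    `streams` maps a channel name to a list of (timestamp_ms, event)
    tuples that is already sorted by timestamp (an assumption).
    Returns a list of (timestamp_ms, channel, event) tuples.
    """
    selected = [
        [(t, channel, event) for t, event in streams[channel]]
        for channel in CHANNEL_SETS[combination]
    ]
    return list(merge(*selected))


streams = {
    "navigation": [(0, "open_page"), (5000, "open_page")],
    "peripheral": [(1200, "mouse_click")],
    "eye": [(300, "fixation")],
}
timeline = combine(streams, "log+eye-tracking")
```

Each richer combination is a superset of the previous one, so any difference in detected processes can be attributed to the channel that was added.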

SRL Measurement and analysis protocol

Based on the framework proposed by Siadaty et al. (2016c), we developed a protocol for measurement of SRL processes. The protocol contained i) a theoretical framework and an SRL coding scheme, where the scheme was defined in the form of rules for identification of SRL processes (e.g., orientation and monitoring), and ii) a trace parser, comprising an action library and a process library, which converted raw log data into learning events – i.e., “event-ized” log data. The trace parsing was conducted separately for each of the three synchronised and gradually combined data channels (i.e., nav_only, enhanced_log and log+eye-tracking), as shown in the scheme in Fig. 3.

Fig. 3 The trace parsing process based on three gradually combined data channels

Theoretical framework and coding scheme

Bannert’s (2007) theoretical framework divides hypermedia learning into three major categories: metacognition, cognition, and motivation. This theoretical framework informed the development of the coding scheme used for manual analysis of think aloud data (Bannert, 2007; Engelmann and Bannert, 2019; Sonnenberg & Bannert, 2015), and here we adapted the framework to analyse trace data. The Metacognition category included the subcategories for Orientation, Goal specification, Planning, Searching for information, Judgement of information relevance, Evaluating goal attainment, and Monitoring and regulation. The Cognition category contained subcategories for Reading and Repeating information, and subcategories for deeper processing, including Elaboration and Organisation of information. The main category of Motivation included all positive and negative utterances on the Task, the Situation, or the Ability (Sonnenberg & Bannert, 2015).

Because this theoretical framework relies heavily on verbal expression data for the definition and coding of Motivation (for example, “I find it difficult to perform well in this task”), it was difficult to identify occurrences of Motivation based on trace data. Therefore, we focused on the first two main categories of Bannert’s framework in this study: Metacognition and Cognition.

To analyse the temporal structure of metacognitive and cognitive events, we simplified the coding scheme to keep the output comparable to other studies (e.g., Engelmann and Bannert (2019) and Sonnenberg and Bannert (2015)) and divided Cognition into Low_Cognition and High_Cognition. Distinguishing between simple low-level reading events and relatively high-level cognitive events such as elaboration helped us understand the SRL process of learners. We also considered the difference of coding based on think aloud and trace data (navigational logs, peripheral data, and eye-tracking), and defined our theoretical framework and coding scheme as shown in Table 1. Once the learning actions were labelled using the action library, the SRL processes were detected based on the process library (Table 2) where each process was created to map to the eight cognitive and metacognitive processes shown in Table 1.

Table 1 Theoretical framework that informed the coding scheme used for analysis of data
Table 2 The process library for detection of SRL processes from action labels

There were two primary components of the trace parser: action library – responsible for labelling raw log data with meaningful learning actions; and process library – responsible for detecting SRL processes based on the action sequences (as shown in Fig. 3).

Action library

In this study, we defined 12 learning actions and labelled the multi-channel data with these actions (as shown in Fig. 1). These actions were TASK_INSTRUCTION, LEARNING_GOAL, RELEVANT_READING, RELEVANT_RE-READING, IRRELEVANT_READING, IRRELEVANT_RE-READING, NAVIGATION, WRITE_ESSAY, NOTE_EDITING, NOTE_READING, TIMER, and OFF_TASK. The definitions of the actions, a detailed description of the labelling process, and documentation of relevant technical issues are provided in the Supplemental Document.
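To make the role of the action library concrete, the following Python sketch enumerates the 12 action labels and applies two toy labelling rules suggested by the examples in the text (a click in the note tool, a glance at the timer). The `label_event` function, its arguments and the zone names are hypothetical; the actual labelling rules are those documented in the Supplemental Document and are not reproduced here.

```python
from enum import Enum


class Action(Enum):
    """The 12 learning actions listed in the text."""
    TASK_INSTRUCTION = "TASK_INSTRUCTION"
    LEARNING_GOAL = "LEARNING_GOAL"
    RELEVANT_READING = "RELEVANT_READING"
    RELEVANT_RE_READING = "RELEVANT_RE-READING"
    IRRELEVANT_READING = "IRRELEVANT_READING"
    IRRELEVANT_RE_READING = "IRRELEVANT_RE-READING"
    NAVIGATION = "NAVIGATION"
    WRITE_ESSAY = "WRITE_ESSAY"
    NOTE_EDITING = "NOTE_EDITING"
    NOTE_READING = "NOTE_READING"
    TIMER = "TIMER"
    OFF_TASK = "OFF_TASK"


def label_event(channel, zone, page_action):
    """Toy labelling rule (an assumption, not the study's rule set).

    `channel` is the source of the raw event, `zone` is the area of
    interest it fell in, and `page_action` is the label that the
    navigational log alone would assign to the current page.
    """
    if channel in ("mouse", "keyboard") and zone == "note":
        return Action.NOTE_EDITING      # peripheral input in the note tool
    if channel == "eye" and zone == "timer":
        return Action.TIMER             # gaze lands on the timer zone
    return page_action                  # otherwise keep the page-level label
```

Under these toy rules, a keystroke in the note zone while a relevant page is open is relabelled from RELEVANT_READING to NOTE_EDITING, which is the kind of refinement the additional channels are meant to provide.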

Process library

In order to build a reliable and valid process library, we initially sourced SRL processes from Bannert’s SRL coding scheme (Bannert, 2007), and then identified other processes from several related papers (e.g., Siadaty et al. (2016c) and Kizilcec et al. (2017)). Based on these previous studies, a hierarchical process library was proposed, as shown in Table 2. Based on the definitions of the subcategories in our theoretical framework, we constructed SRL processes that could be mapped to the eight subcategories. For example, a three-step action sequence such as “RELEVANT_READING to NOTE_EDITING back to RELEVANT_READING” (HC.O.3 in Table 2) represents learners taking notes while reading some relevant content. This was considered an Organisation process which, as a high_cognition process, is different from merely RELEVANT_READING (LC.F.1, a low_cognition process). If an event recorded in any of the three data channels did not belong to any action sequence in Table 2, this event was labelled as No_Process and was not included in the subsequent analysis.
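The matching of action sequences against the process library can be sketched as follows. Only the patterns mentioned in the text are included (HC.O.3, MC.M.4, LC.F.1 and the one-step TIMER process, whose exact code in Table 2 is not given here and is written simply as MC.M). The longest-pattern-first, left-to-right, non-overlapping matching strategy is an assumption made for illustration; the precedence rules of the actual trace parser may differ.

```python
READING = {"RELEVANT_READING", "IRRELEVANT_READING"}


def is_(name):
    """Predicate that matches exactly one action label."""
    return lambda action: action == name


def reading(action):
    """Predicate for the '(IR)RELEVANT_READING' wildcard step."""
    return action in READING


# (process code, one predicate per step); a small excerpt, not Table 2.
RULES = [
    ("HC.O.3", (is_("RELEVANT_READING"), is_("NOTE_EDITING"), is_("RELEVANT_READING"))),
    ("MC.M.4", (reading, is_("LEARNING_GOAL"), reading)),
    ("MC.M", (is_("TIMER"),)),
    ("LC.F.1", (is_("RELEVANT_READING"),)),
]


def detect(actions):
    """Map an action sequence to (process code, matched actions) pairs."""
    rules = sorted(RULES, key=lambda rule: -len(rule[1]))  # longest first
    detected, i = [], 0
    while i < len(actions):
        for code, steps in rules:
            window = actions[i:i + len(steps)]
            if len(window) == len(steps) and all(p(a) for p, a in zip(steps, window)):
                detected.append((code, tuple(window)))
                i += len(steps)
                break
        else:
            # No rule matched at this position: label the event No_Process.
            detected.append(("No_Process", (actions[i],)))
            i += 1
    return detected


example = detect([
    "RELEVANT_READING", "NOTE_EDITING", "RELEVANT_READING",
    "TIMER", "OFF_TASK",
    "IRRELEVANT_READING", "LEARNING_GOAL", "RELEVANT_READING",
    "RELEVANT_READING",
])
```

Trying longer patterns first keeps a note-taking episode from being split into separate reading events, which mirrors the distinction drawn above between HC.O.3 and LC.F.1.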

Validity of the measurement protocol

Even though the use of trace-based protocols to measure SRL has become more popular in recent years, it is important to ensure such protocols or interpretations are valid (Winne, 2020). Samuel Messick defined validity as the “integrated evaluative judgement of degree to which empirical evidence and theoretical rationale supports the adequacy and appropriateness of inferences and actions based on test scores” (Messick, 1994, p. 6). Therefore, in this study, we followed a validation approach that combines theory-driven and data-driven perspectives to ensure the validity of interpretations of SRL processes extracted from trace data (Fan et al., 2022).

Theory-driven perspective

By grounding our analysis in the theoretical framework (Table 1), we implemented different ways to enhance the validity of our trace parser. First, we conducted multiple rounds of in-depth brainstorming based on our theoretical framework to construct the process library. Those involved in the brainstorming included the researcher who developed the theoretical framework, the experimenter who understood the learning process in detail, the designer of the learning environments, and an experienced researcher who was familiar with SRL process construction. During the brainstorming sessions, we discussed how and to what extent the SRL processes reflected the categories from our theoretical framework, the construction of the SRL processes, the step length of the SRL processes (i.e., two-step process as A to B, or three-step process as A to B to C), and the possible interpretations of processes. For example, during our discussion, one researcher proposed mapping a three-step process (“IRRELEVANT_READING to LEARNING_GOAL to RELEVANT_READING”) to the Monitoring process based on our theoretical framework. This three-step process indicates that by checking learning goals during or after reading certain irrelevant content, learners monitored the reading process and moved on to read other relevant content. However, another researcher added his interpretation that learners may go back and check the learning goals while reading either irrelevant or relevant content, and that all such cases should be considered an indication of the Monitoring process based on the definition of Monitoring (see Table 1). Therefore, we revised this three-step process into “(IR)RELEVANT_READING to LEARNING_GOAL to (IR)RELEVANT_READING” (see MC.M.4 in Table 2). We also presented and discussed the measurement protocol with three senior researchers and theorists as an expert review to ensure the validity of the interpretations.

Data-driven perspective

When we drafted the first version of the process library, we tested the detection of SRL processes on the three data channels, and then visualised and analysed these processes on the timeline of a learning session to identify unreasonable processes. For example, we found that some Monitoring processes such as “RELEVANT_READING to TIMER to RELEVANT_READING” were unreasonably long (sometimes 2 minutes). The monitoring of time should be a very short process (e.g., a quick glimpse at the timer can be shorter than 1 second). This led us to re-define some of the processes. For example, we re-defined the time monitoring process from a three-step process (“RELEVANT_READING to TIMER to RELEVANT_READING”) to a one-step process (“TIMER”), because only the TIMER action in the previous three-step process represented Monitoring.

In order to improve the validity of our trace-based measurement protocol, we also collected think aloud data and used them to triangulate with trace data. Although the think aloud method has its own limitations (Young, 2005; Fan et al., 2022), interpretations based on think aloud codes are still considered a more valid measurement than self-report survey data (Bannert and Mengelkamp, 2013; Veenman, 2007; Bannert, 2007; Greene & Azevedo, 2010; Fan et al., 2022). Therefore, we considered think aloud codes as “reference points” in this study to test the validity of the trace-based measurement protocol. The experimenters first led a training session to familiarise participants with the think aloud procedure before the experiment. During the experiment, the experimenter ensured that the learners kept thinking aloud by providing prompts if there was a long period of silence. In the coding stage, the audio recordings of participants’ think aloud were segmented, transcribed and coded into SRL processes by well-trained coders based on a previously developed coding scheme (Bannert, 2007; Molenaar et al., 2011) which shares the same theoretical framework with our trace-based measurement protocol. We used the think aloud codes to help us detect more interpretable and meaningful action sequences. For example, if an action sequence “A to B” frequently aligned with the think aloud code Monitoring and rarely aligned with other think aloud codes, we would conclude that this action sequence could be interpreted as the SRL process Monitoring.

These data-driven analyses, together with the theory-driven perspective, allowed us to further improve the validity of the measurement protocol and resulted in the final version of the process library shown in Table 2. In order to evaluate the validity of the final process library, we aligned the trace-based measurement results with the think aloud measurement results on a synchronised timeline and calculated the measurement agreement (i.e., the same SRL process was detected in the same time slot). We achieved a 65.73% agreement between the two methods, which means that in nearly two-thirds of the time period where both methods detected SRL processes, the think aloud and trace protocols measured the same SRL process (e.g., both detected a Monitoring process between minute 1 and minute 2). This result indicated that the interpretations in our trace-based measurement protocol are valid to a certain extent when considering think aloud as a “reference point”. The full description of our validation process can be found in Fan et al. (2022).
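The agreement calculation described above can be sketched in Python as the share of time slots with identical labels among the slots in which both methods detected a process. The fixed-length slot representation, the use of `None` for "no process detected" and the `agreement` function are assumptions made for this illustration; the exact procedure is described in Fan et al. (2022).

```python
def agreement(trace_slots, think_aloud_slots):
    """Share of jointly coded time slots in which both labels match.

    Both arguments are equally long lists with one entry per time slot
    on the synchronised timeline; an entry is a process name, or None
    when that method detected no SRL process in the slot.
    """
    if len(trace_slots) != len(think_aloud_slots):
        raise ValueError("both timelines must cover the same slots")
    both = [
        (t, a) for t, a in zip(trace_slots, think_aloud_slots)
        if t is not None and a is not None
    ]
    if not both:
        return float("nan")  # no overlap, so agreement is undefined
    return sum(t == a for t, a in both) / len(both)


trace = ["Monitoring", "Monitoring", None, "Planning"]
think_aloud = ["Monitoring", "Planning", "Planning", None]
score = agreement(trace, think_aloud)  # 2 jointly coded slots, 1 match
```

Slots coded by only one method are excluded from the denominator, in line with the statement that agreement refers to the time period where both methods detected SRL processes.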

Data analysis

We obtained (i) the action labels from the raw data and (ii) SRL processes from the sequence of actions using the trace parser. In order to examine how the detection of SRL processes differs between using navigational logs only and using combinations of data channels (RQ1), we conducted a descriptive statistical analysis to calculate the proportions of frequency and duration of each SRL process. Based on the measurement results (as demonstrated in the SRL process tracks in Fig. 3), the frequency of each SRL process for each learner was calculated as the number of occurrences of that process identified (e.g., 10 occurrences of MC.M were identified during the participation of one learner), and the duration of each SRL process for each learner was calculated as the sum of the durations of all occurrences of that process (e.g., a total duration of 35 seconds for 10 MC.M occurrences). The duration of each event can be calculated from the definition of the learning actions (see the Supplemental Document). For example, a NOTE_EDITING action may start with a mouse click that puts the cursor in the note tool; the action is considered to continue (e.g., while the learner keeps typing on the keyboard) for several seconds or even minutes, and it ends when the mouse cursor or eye gaze moves back to the reading zone. As the proportions of frequency and duration data were not normally distributed, we report median values along with 25th and 75th percentile values. In order to compare the measurement results based on the three different data channels and test whether the proportions of SRL processes differed significantly across these results, we also conducted a Friedman test followed by post hoc Wilcoxon signed rank tests for the pair-wise comparisons (with Bonferroni correction).
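A minimal Python sketch of the descriptive step is given below: it computes, for one learner, the proportion of frequency and of duration per SRL process, and summarises a set of proportions by its median and quartiles. The function names and the `(process, seconds)` input layout are assumptions, and the inclusive quartile method is one of several common percentile definitions and not necessarily the one used in the study.

```python
from collections import defaultdict
from statistics import median, quantiles


def proportions(occurrences):
    """Per-process (frequency share, duration share) for one learner.

    `occurrences` is a non-empty list of (process, seconds) pairs,
    one pair per detected occurrence of an SRL process.
    """
    freq = defaultdict(int)
    dur = defaultdict(float)
    for process, seconds in occurrences:
        freq[process] += 1
        dur[process] += seconds
    n_total = sum(freq.values())
    t_total = sum(dur.values())
    return {p: (freq[p] / n_total, dur[p] / t_total) for p in freq}


def summarise(values):
    """Median with 25th and 75th percentiles (needs >= 2 values)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), q1, q3


# Example from the text: 10 MC.M occurrences summing to 35 seconds,
# next to 5 hypothetical LC.F occurrences of 21 seconds each.
learner = [("MC.M", 3.5)] * 10 + [("LC.F", 21.0)] * 5
shares = proportions(learner)
```

In this example MC.M accounts for two-thirds of the occurrences but only a quarter of the time, which illustrates why frequency and duration are reported separately.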

To better understand which SRL processes were mainly affected or refined over time by adding new data channels (RQ2), we aligned the measurement results based on different data channels on the same timeline. Then, we defined a new metric, the “percent of processes refined”, to represent the proportion of time in which SRL processes were affected or refined when new data channels were added. The percent of processes refined was calculated by dividing the total duration of refined SRL processes by the total duration of refined and unrefined SRL processes. For example, the percent of processes refined between Enhanced_log and Nav_Only in the example in Fig. 3 equals the duration of the refined process (from LC.F to HC.E, which was caused by NOTE_EDITING) divided by the total duration on the timeline. The percent of processes refined between data channels can be interpreted as the new information provided by new data sources, which leads to new actions being labelled and new SRL processes being detected. This analysis provided information on how and to what extent the addition of peripheral and eye-tracking data could improve the detection of SRL processes and thus was used to address RQ1 and RQ2.
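
A minimal sketch of the “percent of processes refined” metric is given below. It assumes that each channel's result is a list of labelled, non-overlapping segments on a shared timeline; the segment representation, the function names and the decision to skip time that is unlabelled in either channel are assumptions of this illustration.

```python
from typing import List, Optional, Tuple

Segment = Tuple[float, float, str]  # (start_s, end_s, process label)


def label_at(timeline: List[Segment], t: float) -> Optional[str]:
    """Process label active at time t, or None if no segment covers t."""
    for start, end, label in timeline:
        if start <= t < end:
            return label
    return None


def percent_refined(base: List[Segment], enriched: List[Segment]) -> float:
    """Duration on which the two timelines disagree, divided by the total
    duration labelled in both (refined plus unrefined time), in percent."""
    cuts = sorted({t for s, e, _ in base + enriched for t in (s, e)})
    refined = total = 0.0
    for lo, hi in zip(cuts, cuts[1:]):
        a, b = label_at(base, lo), label_at(enriched, lo)
        if a is None or b is None:
            continue  # assumption: skip time unlabelled in either channel
        total += hi - lo
        if a != b:
            refined += hi - lo
    return 100.0 * refined / total if total else 0.0


# Toy timelines: the last 5 minutes change from LC.F to HC.E.
nav_only = [(0, 300, "MC.O"), (300, 900, "LC.F")]
enhanced_log = [(0, 300, "MC.O"), (300, 600, "LC.F"), (600, 900, "HC.E")]
result = percent_refined(nav_only, enhanced_log)
print(round(result, 2))
```

Cutting the timeline at every segment boundary of either channel means each resulting interval has exactly one label per channel, so the comparison reduces to a simple equality check per interval.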

To reveal the change in the temporal sequencing of the recognised SRL processes (RQ3), we applied the pMineR process mining technique to generate SRL process maps based on different data channels (Gatta et al., 2017). The pMineR technique has previously been used in the analysis of learning and time management strategies (Matcha et al., 2019; Ahmad Uzir et al., 2020) and SRL processes (Saint et al., 2020a). The pMineR technique produced three first-order Markov models (FOMMs), which were extracted as SRL process maps using i) navigational log data, ii) combined navigational and peripheral data (i.e., enhanced log data), and iii) combined navigational, peripheral, and eye-tracking data. Because the number of transitions that can be observed increases when new data channels are added, we used the transition probabilities for the comparison across data channels. The FOMMs provided the SRL process maps that included probabilities of transitions from one SRL process to another one for every learner, which remain comparable when exploring the temporal sequential relationships. A comparison of the three FOMMs was performed to answer our third research question.
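
To illustrate what the transition probabilities in a first-order Markov model represent, the following sketch estimates them from process sequences in plain Python. It is not the pMineR implementation (pMineR is an R package); the function name, the toy sequences and the pooling of all learners into one model are assumptions of this illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, List


def transition_probabilities(
    sequences: List[List[str]],
) -> Dict[str, Dict[str, float]]:
    """Estimate P(next process | current process) from consecutive pairs,
    pooling the sequences of all learners."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    return {
        current: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for current, c in counts.items()
    }


# Hypothetical process sequences for two learners.
sequences = [
    ["MC.O", "MC.O", "LC.F", "LC.F", "LC.F", "HC.E"],
    ["MC.O", "LC.F", "MC.M", "LC.F", "HC.E"],
]
fomm = transition_probabilities(sequences)
print(fomm["LC.F"])
```

Because each row is normalised by the number of transitions leaving that process, the probabilities stay comparable across channels even when one channel yields many more transitions than another.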


RQ1: Comparison of the data channels regarding the detection of SRL processes

In total for all 25 participants, we obtained 2,658 rows of actions from the Nav_only channel, 4,971 rows of actions from the Enhanced_log channel, and 81,998 rows of actions from the Log+eye-tracking channel. Based on the analysis of the series of these actions, we detected median values of 63, 98, and 1,680 SRL processes (based on Table 2) per participant from the Nav_only channel, Enhanced_log channel, and Log+eye-tracking channel, respectively. However, not only did the addition of new data channels increase the total and the median numbers of actions and processes, but it also affected the frequency and duration distribution of different SRL processes. As shown in Table 3, we calculated the median and 25th and 75th percentiles of each SRL process based on the frequency of occurrence and duration for the participants involved in the study. We also report the statistical comparison results using the Friedman test followed by Wilcoxon signed rank tests for the pair-wise comparison (with Bonferroni correction) in Table 3. Most processes we detected from the labelled actions were cognitive processes, including low_cognition processes such as First-reading and high_cognition processes such as Elaboration. Based on the Log+eye-tracking data channel, most learners spent about half of their time on the low_cognition processes, almost 30% of their learning time on the high_cognition processes, and around another 5% on the metacognitive processes.

Table 3 Descriptive statistics of the SRL processes detected from the multi-channel data: median(25th, 75th) frequency(%) and median(25th, 75th) duration(%)

The comparison of the median duration of each process across the three data channels revealed that the frequency and duration proportions of various SRL processes varied greatly. For example, as shown in Table 3, the differences in the median duration of many processes between the Nav_only data channel and the Enhanced_log data channel were quite stark. We detected many more Organisation processes (43.03% based on Enhanced_log compared to 0% based on Nav_only), and far fewer First-reading processes (19.14% based on Enhanced_log compared to 65.09% based on Nav_only). This was caused by more actions being labelled as NOTE_EDITING and WRITE_ESSAY when the Enhanced_log data channel was used (mostly based on keyboard strokes), and by more Organisation processes such as “RELEVANT_READING/IRRELEVANT_READING to NOTE_EDITING back to RELEVANT_READING/IRRELEVANT_READING” or “WRITE_ESSAY to NOTE_EDITING” being detected when the Enhanced_log data channel was used.

The addition of eye-tracking data led to a new distribution of SRL processes. For example, the First-reading processes detected from the Log+eye-tracking data channel accounted for 49.86% of the whole learning sessions, which was a shorter duration than that of the First-reading detected from the Nav_only data (65.09%) and longer than that of the First-reading detected from Enhanced_log (16.11%). The median value of the Organisation processes detected from the Log+eye-tracking data channel accounted for 4.58% of the duration of the whole sessions, which was much shorter than the duration of Organisation detected from the Enhanced_log data (43.03%) and longer than that of the Organisation detected from the Nav_only data (0%). This large fluctuation was due to the inaccurately measured NOTE_EDITING actions in the Enhanced_log data channel. Without eye-tracking data showing when the learners went back to reading, the Organisation processes (such as RELEVANT_READING to NOTE_EDITING to RELEVANT_READING) can be unreasonably long. For example, one learner spent 10 minutes reading one page and, during this time slot, spent 0.5 minutes taking notes. Instead of considering the whole 10 minutes as Organisation, with the help of eye-tracking data, we were able to find the specific time slot and detect it as Organisation accurately, and record the rest as First-reading or Re-reading.
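
The effect of the missing termination event can be illustrated with a small sketch. It assumes that an action lasts until the next logged event (or the end of the session); the event timestamps and the function name are hypothetical, chosen to mirror the 10-minute example above.

```python
from typing import List, Tuple

Event = Tuple[float, str]  # (timestamp in seconds, action label)


def action_durations(events: List[Event], session_end: float):
    """Each action is assumed to last until the next logged event."""
    ordered = sorted(events)
    ends = [t for t, _ in ordered[1:]] + [session_end]
    return [(label, end - start)
            for (start, label), end in zip(ordered, ends)]


# One learner stays on a page for 10 minutes and takes notes for 30 seconds.
enhanced_log = [(0.0, "RELEVANT_READING"), (120.0, "NOTE_EDITING")]
# A gaze shift back to the reading zone adds a new event (assumed timestamp).
log_eye = enhanced_log + [(150.0, "RELEVANT_READING")]

without_eye = action_durations(enhanced_log, session_end=600.0)
with_eye = action_durations(log_eye, session_end=600.0)
print(without_eye)
print(with_eye)
```

Without the gaze event, NOTE_EDITING absorbs the remaining eight minutes on the page; with it, note editing is bounded to 30 seconds and the rest of the time is attributed to reading.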

More importantly, in comparison to the Nav_only and Enhanced_log data channels, the addition of the eye-tracking data allowed us to detect SRL processes such as Evaluation and Monitoring more frequently, even though these processes only took up a tiny proportion of the whole learning sessions. For example, from the Log+eye-tracking data channel, we detected median values of 0.93% (frequency-based) for the Evaluation processes (e.g., IRRELEVANT_READING to NOTE_READING) and 3.45% (frequency-based) for the Monitoring processes (e.g., checking the TIMER during reading), which can be hard to detect from the Nav_only and Enhanced_log data channels.

RQ 2: Mainly affected SRL processes when adding new data channels

Figure 4 presents timelines of the SRL processes of one of the learners (participant No. 25, abbreviated as P25 below) involved in the study for each of the three data channels. We use this as an example to illustrate the detection of SRL processes based on different data channels. As shown in Fig. 4 (Nav_only data channel), this participant first spent around 5 minutes reading the task instructions and the learning goals (i.e., Orientation, MC.O in red), then read (i.e., First-reading, LC.F in blue and Re-reading, LC.R in purple) most of the learning materials for around 30 minutes, and finally spent the last 10 minutes writing the essay (Elaboration, HC.E in pink). As shown in Fig. 4, we detected new or different processes from the Enhanced_log and Log+eye-tracking data channels. The addition of the peripheral data (Enhanced_log) allowed us to detect the Planning, MC.P (in light-brown) processes during the Orientation, MC.O processes at the beginning of the study session (the 2-minute time slot from the 1st to the 3rd minute). The introduction of eye-tracking data allowed us to detect more occurrences of the Elaboration, HC.E process during the reading (the pink lines around the 30th minute) and more occurrences of the Monitoring, MC.M (in light-green) process during essay writing (very narrow green lines in the last 10 minutes).

Fig. 4

SRL process measurement results based on three data channels for participant P25 as an example. The timelines for the three channels are merged. Legend: MC.O – Orientation; MC.P – Planning; MC.E – Evaluation; MC.M – Monitoring; LC.F – First-reading; LC.R – Re-reading; HC.E – Elaboration; HC.O – Organisation; NA – No_Process.

The results of the analysis for all 25 participants revealed 46.05% for the overall “percent of processes refined” between Nav_only and Enhanced_log and 29.35% for the overall “percent of processes refined” between Nav_only and Log+eye-tracking. We also detected 51.57% for the “percent of processes refined” between Enhanced_log and Log+eye-tracking. Here, the “percent of processes refined” between data channels showed that adding new data channels did indeed greatly influence and change the measurement results of SRL processes. However, not all changes can be interpreted as “improvement of measurement”: a change observed when adding new data channels does not necessarily indicate improved measurement accuracy. To understand how new data channels influence the measurement results, we need to explain in detail which SRL processes were measured differently and why.

Further analysis of the refined time slots revealed how and why we could improve the reliability of SRL process detection by adding new data channels. For instance, as shown in the left sub-figure in Fig. 5, 77.07% of all the refined processes were processes that changed from First-reading or Re-reading (based on the Nav_only data channel) to Organisation (based on Enhanced_log). This is because a new action, NOTE_EDITING, was identified when the Enhanced_log data were used (mostly keystrokes). Therefore, some simple one-step processes that had been detected based on the Nav_only data were changed into multi-step processes based on the Enhanced_log data; for example, LC.F.1 (RELEVANT_READING) was changed into HC.O.3 (“RELEVANT_READING to NOTE_EDITING to RELEVANT_READING”). However, this disagreement between the Nav_only and Enhanced_log data channels cannot simply be understood as an improvement in the measurement of SRL processes, because this disagreement also includes the inaccuracy of the Enhanced_log data. The middle sub-figure in Fig. 5 shows a correction of the measurement achieved by adding the eye-tracking data: a considerable proportion of the occurrences of the Organisation process (e.g., HC.O.3) were changed back to occurrences of First-reading and Re-reading (e.g., LC.F.1 and LC.R.2).

Fig. 5

Top five refined processes between data channels for all 25 participants. Legend: the definition of each code in this figure can be found in Table 2, for example process LC.F.1 is “RELEVANT_READING”

Further analysis of the refined processes with the focus on metacognition processes showed that more frequent and more reliable metacognition processes were detected based on the Log+eye-tracking data channel than with the use of the other two data channels, even though these refined processes only took up a very small proportion of the sessions. For example, some high_cognition processes (such as HC.O.3 and HC.E.1) based on the Enhanced_log data channel were relabelled as meta_cognitive processes (mostly Monitoring processes, such as MC.M.1 – checking the timer during reading, or MC.M.6 – looking at the catalogue zone during writing) when the Log+eye-tracking channel was used. More importantly, the addition of the eye-tracking data also changed some SRL processes inside the range of metacognition (e.g., some Monitoring processes based on the Enhanced_log data changed to Orientation). For example, by adding the eye-tracking data, we identified new and short processes such as “NAVIGATION to NOTE_READING to NAVIGATION” (MC.O.4) inside a long process which was initially detected as “WRITE_ESSAY to TASK_INSTRUCTION/LEARNING_GOAL” (MC.M.5) based on the Enhanced_log data. This is because when learners navigated back to and checked the task instruction or the learning goal during writing, they also glanced over the catalogue with references to their notes to orientate themselves on which pages to read next in order to inform their writing and to meet the task requirements and learning goals. That is, students did this to check if they were writing according to the task requirements, and thus we consider this process to be Monitoring.

RQ3: Temporal sequential relationships

In this subsection, we report the results related to RQ3, which aimed to analyse the SRL process maps detected with the use of the three data channels.

Nav_only data channel

As shown in Fig. 6, the learners started their learning sessions with the Orientation (MC.O) and then continued with the First-reading (LC.F). After or during the reading, some of them engaged in the Elaboration (HC.E) process. Some transitions between processes were detected as highly probable with the use of the Nav_only data, such as the transition from Evaluation (MC.E) to Re-reading (LC.R) (100%) and from Monitoring (MC.M) to First-reading (LC.F) (35%). We also found a high probability of the continuous use of some SRL processes, such as a transition probability of 92% from First-reading (LC.F) to First-reading (LC.F) and 78% from Orientation (MC.O) to Orientation (MC.O).

Fig. 6

The first order Markov model of the temporal links between SRL processes detected based on the Nav_only data channel

Enhanced_log data channel

As shown in Fig. 7, we obtained a more complete process map from the Enhanced_log data channel compared to the map obtained based on the use of the Nav_only data. By adding the peripheral data such as mouse clicks and keyboard strokes, we were able to detect a new SRL process (Planning, MC.P) which, along with the Orientation (MC.O) and Monitoring processes (MC.M), formed critical transitions between these SRL processes in the early stages of learning. Fig. 7 shows a more complex SRL process map which involves First-reading (LC.F), Elaboration (HC.E) and Organisation (HC.O), which were all detected while the learners were reading the learning materials.

Fig. 7

The first order Markov model of the temporal links between SRL processes detected based on the Enhanced_log data channel

With this new data channel, transitions with higher transition probabilities (compared to Nav_only results) were also found in this SRL process map. For instance, the probability of transition from Elaboration (HC.E) to Monitoring (MC.M) increased from 24% (based on Nav_only) to 44% (based on Enhanced_log). A two-way transition in Fig. 7 that showed an increased probability is the transition from First-reading (LC.F) to Organisation (HC.O) (increased from less than 1% based on Nav_only to 16% based on Enhanced_log) and then back to First-reading (LC.F) (increased from less than 1% based on Nav_only to 79% based on Enhanced_log). These new transitions with higher transition probabilities in Fig. 7 and their comparison to those shown in Fig. 6 are further considered in the Discussion section.

Log+eye-tracking data channel

The SRL process map detected from the Log+eye-tracking data channel is shown in Fig. 8. This SRL process map shows a clearer learning pathway: the learners first engaged in Orientation (MC.O) and Planning (MC.P), then started a series of cognitive processes such as First-reading (LC.F), Re-reading (LC.R), Elaboration (HC.E), and Organisation (HC.O). In addition, in this SRL process map, Monitoring (MC.M) and Evaluation (MC.E) were interspersed more with the cognitive processes. It is worth noting that Monitoring (MC.M) and the transitions it formed with other processes were under-detected in the first two data channels. Only when the eye-tracking data were added were we able to detect Monitoring (MC.M) as an intermediate process linking most of the other SRL processes. This indicates that the learners monitored their learning across the whole session of our study.

Fig. 8

The first order Markov model of the temporal links between SRL processes detected based on the Log+eye-tracking data channel

The comparison of the SRL process map in Fig. 8 with the other two SRL process maps shown in Figs. 6 and 7 also revealed that some of the metacognitive processes such as Evaluation (MC.E) were connected differently with other SRL processes in the overall SRL process maps. For instance, in Fig. 8, Evaluation (MC.E) is placed between First-reading (LC.F) and Re-reading (LC.R), which formed transitions and constructed important SRL processes (evaluation between reading and re-reading). For example, the occurrences of the Evaluation (MC.E) process such as MC.E.2 (IRRELEVANT_READING to NOTE_READING) indicate that the learners evaluated the relevance of reading materials and decided what to read or re-read next to meet the task expectations. These transitions could only be detected with the use of the Log+eye-tracking data channel because many of them could not be found with the analysis of the navigational log data only.


In this section, we discuss our findings with respect to the three research questions and outline the implications for research and practice.

RQ1 and RQ2: Granularity in the detection of SRL processes

Previous studies define and label learning actions mainly based on the navigational log data generated by learning environments (Saint et al., 2018; Matcha et al., 2019; Kizilcec et al., 2017). Only a few studies have incorporated peripheral data (e.g., mouse clicks and movements, keystrokes, and window scrolling) (Hörmann & Bannert, 2016; Lali et al., 2014; Bernacki et al., 2012). Many typical events about learning actions have been labelled and investigated in the previous literature, such as actions with video playing, views of reading materials, start and submission of a quiz, or views of learning goals (Matcha et al., 2019; Saint et al., 2020a; Kizilcec et al., 2017). In order to detect and compare learning actions extracted based on new data channels such as peripheral and eye-tracking data, in this paper, we created an action library containing 12 actions which can be extracted from different data channels. With the help of peripheral data, we were able to detect new actions such as NOTE_EDITING (typing in the note taking tool) and NAVIGATION (mouse moving over the catalogue zone) from the Enhanced_log data channel. These new actions enabled us to capture more SRL processes. These actions are usually not logged and are neglected in the traditional learning environments or regular navigational log data, which are commonly used in learning analytics (Gašević et al., 2017). These limitations are inherent in how the log data are processed. For instance, when learners want to check the timer, they simply move their eye fixation onto the timer zone without any mouse click or movement. Therefore, with the help of eye-tracking data, we were able to detect additional actions such as NOTE_READING (fixation on the note zone), TIMER (a quick glimpse at the timer) and another kind of NAVIGATION (fixation on the catalogue zone without moving the mouse over the catalogue zone) with the use of the Log+eye-tracking data channel. These new actions greatly improved the granularity of data about the learning process, and allowed us to capture many actions which were difficult to detect with the navigation or peripheral log data.

This improved action labelling led to more fine-grained SRL processes, which were mapped to the elements of theoretical models of SRL. Our findings revealed which SRL processes were mainly affected or refined by the addition of peripheral and eye-tracking data. In this study, in addition to detecting cognition processes such as First-reading and Organisation, we were also able to detect metacognition processes such as Orientation and Monitoring. These SRL processes were traced and found in previous studies, mostly using trace data (Siadaty et al., 2016c; Saint et al., 2018; Kizilcec et al., 2017) or think aloud data (Sonnenberg & Bannert, 2016; 2019; Engelmann & Bannert, 2019; Molenaar et al., 2013). However, in this paper, we compared the contributions of different data channels in the detection of individual processes. By gradually enriching the original navigational log data with peripheral and eye-tracking data, we saw the improvements brought by these two data channels. In particular, we proposed a novel approach to using eye-tracking data to systematically detect SRL processes, which enabled us to detect new orientation processes such as “NAVIGATION to NOTE_EDITING to NAVIGATION” and new evaluation processes such as “IRRELEVANT_READING to NOTE_EDITING to IRRELEVANT_READING”. Although these SRL processes only accounted for a small proportion of the entire learning sessions and were usually not easily noticeable, their recognition allowed us to better unpack the learning process of the learners and helped us better model the SRL processes.

Another, somewhat expected, insight from the comparison across different data channels is that adding new data channels such as peripheral data improved the measurement of SRL processes. However, sometimes this addition also posed new issues. As shown in the timeline-based comparison (see Section “RQ 2: Mainly affected SRL processes when adding new data channels”), the Enhanced_log data (which included the peripheral data) relabelled a significant proportion of the actions (e.g., RELEVANT_READING) and processes (e.g., First-reading) into new learning actions (e.g., NOTE_EDITING) and new processes (e.g., Organisation). New data channels improved the richness of measurement for actions and processes, but they may have also posed a new reliability problem for the measurement. This reliability problem relates to how to determine when an action is terminated. For example, based on the Enhanced_log data channel, we considered learners to be engaged in the Organisation or Elaboration processes if they took notes during reading and writing, and to remain in the Organisation or Elaboration processes until a new action was logged. If a learner spent 5 minutes reading new materials after NOTE_EDITING without any mouse movement, then no event would be recorded in the Enhanced_log data before this learner navigated to a new web page. This problem cannot be easily solved unless a new data channel such as eye-tracking is involved. For instance, when learners moved their fixation from the note taking zone back onto the reading zone, the recorded eye-tracking data allowed us to detect a new action, which was labelled as RELEVANT_READING, and this action terminated the previously detected action labelled as NOTE_EDITING. This is why we concluded that adding new data channels such as eye-tracking not only improved the richness but also increased the reliability of detection of learning actions and SRL processes in comparison to those that can be detected from the peripheral and navigational log data.

RQ3: Complexity and completeness of SRL processes

In order to answer the third research question, we used a process mining technique to compare the temporal and sequential relationships between SRL processes detected with the use of the multi-channel data. As shown in Section “RQ3: Temporal sequential relationships”, by comparing the three SRL process maps extracted from the three data channels, we obtained more complex and complete SRL process maps when additional data channels were incorporated into the original navigational log data channel. For example, we found that Organisation was added to the SRL process map as a new high_cognition process upon the inclusion of the peripheral data. We also found new transitions (i.e., First-reading to and fro Evaluation to and fro Re-reading) and detected Monitoring (MC.M) as an intermediate process linking most of the other SRL processes, which indicates that the learners monitored their learning across the whole session of our study. The analysis based on the multi-channel data offered some new insights into SRL process maps. For example, we found that the learners evaluated the relevance of reading materials and decided what to read or re-read next by reading their notes before navigating to other learning materials provided in the learning environment.

The SRL process map generated from the Log+eye-tracking data channel contained several SRL processes similar to the maps extracted in the previous studies which were based on the same coding scheme of SRL processes (Bannert, 2007) but were generated from think aloud data (Engelmann & Bannert, 2019; Bannert et al., 2014; Sonnenberg & Bannert, 2016). For instance, Engelmann and Bannert (2019) found a typical SRL process map to be composed of the Read to and fro Evaluation to Repeat transitions for the learners supported by metacognitive prompts. This is also detected in our SRL process map as the First-reading to and fro Evaluation to and fro Re-reading transitions. Another similar finding is that Monitoring was identified as an intermediate SRL process linking many other processes such as reading, elaboration, and repeating by Bannert et al. (2014) based on the analysis of think aloud data; this was also found in our study based on the Log+eye-tracking data channel. Analysis of SRL processes using think aloud data has been shown not only to offer a reliable measurement but also to provide deeper insights into the learner’s regulatory processes than questionnaire-based data (Bannert & Mengelkamp, 2013; Bannert et al., 2014; Sonnenberg & Bannert, 2015). In this study, we found that using multi-channel trace data can also be a more reliable measurement approach and provide deeper insights into SRL processes than using navigational log data only. It is worth noting that transitions of processes such as Re-reading to Evaluation, which are not frequently detected based on think aloud data (Engelmann & Bannert, 2019), are captured in this study with the Log+eye-tracking data channel.

The above comparison between our findings and the findings of the previous studies offers strong parallels and suggests a potential agreement between the multi-channel trace data used in the current study and the think aloud data used in the studies reported in the literature. However, cross-validation of the measurement of SRL processes based on multi-channel trace data against results based on think aloud data remains an open research question and a promising direction for future studies.

Implications for research and practice

From the methodological point of view, we followed a learning analytic approach and a measurement protocol for self-regulated learning, which was originally proposed by Siadaty et al. (2016c). The detailed action and process libraries can be used as references and serve as practical guidelines for researchers who are interested in measuring and modelling SRL based on multi-channel data. This measurement protocol can be applied in any technology-enhanced learning environment (Siadaty et al., 2016c); however, definitions and detection of specific actions and processes require adjustments depending on the learning environment and tasks that are used for data collection. For example, researchers need to consider different areas of interest in their learning environment if they are using eye-tracking, and develop rules for mapping gaze patterns to relevant SRL processes. Both of these activities can be very challenging and time-consuming (Hörmann & Bannert, 2016; Lali et al., 2014) and can require multiple iterations to assure the validity and usefulness of the measurement protocol (Saint et al., 2018). As shown in this study, cross-validation across multiple data channels can itself enhance the reliability of the rules for detection of SRL processes and the validity of measurements. Another approach to improving the validity is to use think aloud data to validate and triangulate inferences drawn from the data channels analysed in the current study (Fan et al., 2022).

Another implication of this research is that the design of learning environments and tasks directly affects how we measure the process of SRL. Existing research has shown that SRL is inherently contextual, and the specific features or tools of a learning environment can influence if and to what extent learners engage in SRL processes (Siadaty et al., 2016c; Winne & Perry, 2000; Boekaerts & Cascallar, 2006; van der Graaf et al., 2021). For example, in the current study, the note taking and timer tools were designed to be always visible in the learning environment. However, we could only use eye-tracking data to detect if the learners looked at notes and the timer, which are both indicative of monitoring and evaluation processes of SRL. Therefore, researchers and practitioners involved in the design of learning environments that promote the development of SRL skills should not only consider the pedagogical intent behind tasks for learners, but also carefully analyse the availability and suitability of different data channels to track relevant SRL actions and processes.

In conclusion, the results of the study showed that the addition of new data channels to commonly used navigational log data can increase the detection of theoretically meaningful actions and processes of SRL while improving the granularity of measurement. The results also demonstrated improvements in the modelling of SRL processes with the use of multi-channel data in comparison to the SRL process maps obtained with the use of the navigational log data only. However, we should stress that adding new data channels may also create new challenges, such as the reliability issues we encountered when using peripheral data in the current study. In general, multi-channel data that combine data channels such as navigational log, peripheral data and eye-tracking data proved valuable for the detection and measurement of SRL processes and should receive more attention in future research on SRL.

Limitations and future works

A limitation of this study is that it only included fixation data of 25 participants. Other eye-tracking data (e.g., gaze patterns or saccades) from a larger sample of participants should be collected and analysed in the future to overcome this limitation. Another limitation is that we used a relatively simple learning environment in this study, which only contained two instrumentation tools (i.e., note and timer). Therefore, one important direction for future research is to study the value of using more instrumentation tools such as those supporting highlighting, information searching or planning. The use of such tools can increase the granularity of navigational log data. Another valuable research direction is to examine the extent to which think aloud data can be used to triangulate the findings obtained with the three data channels used in the current study.