In the following, we describe a formal approach as a theoretical layer intended to bridge the gap between (arbitrary) log events that are specific to the technical implementation of an assessment platform and derived indicators related to substantive research questions of log data analysis. For that purpose, we use the well-known idea of finite-state machines (also called finite-state automata), which represent a formal mathematical model of computation, and apply the concept of abstract machines to the interaction between test taker and assessment platform. The additional theoretical layer is related to the taxonomy presented in the previous part, as it allows states to be defined based on log events and information about states to be aggregated into indicators (see Fig. 2).
Decomposing processes using finite-state machines
Finite-state machines (FSMs, e.g., Alagar and Periyasamy 2011) are already used in assessments to program and create complex, interactive instruments (see, e.g., Roelke 2012; Neubert et al. 2015), and are well known as a technique for software development, used, for instance, in game-based assessments (Mislevy et al. 2016). In the following, we use FSMs as a tool to analyze log data retrospectively, that is, after the assessment is finished. Note that using FSMs as an analysis tool means that the assessment platform (i.e., the software used to administer the computer-based test or questionnaire) is not strained with additional load, and that the specific FSM need not be known before or during data collection. FSMs have rarely been applied to the analysis of log data. Almond et al. (2012) used FSMs to classify log entries, but focused on the processing of keystrokes in a writing assessment. Bergner et al. (2014) analyzed data from a complex computer-based task of an engineering literacy assessment using state sequences as modeled with the R package TraMineR (Gabadinho et al. 2011), without explicitly linking the investigated states to the log events.
The proposed framework is a new approach to contextualize the information provided as log events in states. The framework generalizes the retrospective analysis of Almond et al. (2012) beyond the classification of typing events and elaborates the retrospective reconstruction of state sequences as a versatile and generic tool for the analysis of log data using different FSMs to extract specific indicators depending on the respective research question.
States and set of states
As described, states designate specific sections of the process conceptualized as an interplay between test taker and assessment platform and require a formal definition (Almond et al. 2012). The meaning of states is constituted by three components:
a) the information that is presented by the assessment platform in a designated phase of the assessment (e.g., the texts, images, videos, etc. shown on the screen),
b) the possibilities to interact with the content offered by the platform in that specific phase (e.g., the interactive components such as buttons and input elements), and
c) a justifiable theoretical interpretation of the meaning of this particular period of an assessment (e.g., expected cognitive processes that are relevant for the state with respect to the interplay between test taker and assessment platform).
In this sense, states provide the semantics for a theory-based analysis of log data. Describing and defining the states that are distinguished for creating indicators is seen as the cornerstone of log data analysis. Likewise, for the computation of simple descriptive statistics or the application of complex psychometric methods, such as process mining (e.g., Romero 2011; Ferreira 2017), a terminology is needed that relates gathered log events to meaningful parts of the assessment process. States are no more than labeled eggs, similar to constructs in latent variable models (Nachtigall et al. 2003), until a proper description is provided and evidence is gathered that supports the hypothesized meaning of the processes related to this state. It is important to emphasize that the meaning of states is not created bottom-up from the information and interactivity provided by the assessment platform (a and b), but rather from the top-down description of the states (c) related to the assessment framework (see previous section). Accordingly, the concept of states in this framework goes beyond the use of FSMs as an algorithmic tool for implementing or modeling complex systems.
States are thought of as the filtered data that encode the information used for the computation of indicators (with respect to Luecht and Clauser 2002, see above) by contextualizing events. Hence, the same log event can be understood differently in the FSM approach, depending on the current state.
No empirical data are required for defining the set of states S that are used to compute indicators for empirical applications. States can be defined a priori, and this should even be done based on an assessment framework to make sure that the assessment system ultimately provides all log events needed for identifying those states. For the a priori definition of states, no knowledge about the assessment platform is needed, so that the definition of states is not expected to be specific to the chosen implementation of a computer-based item (top down). However, states can also be defined or changed afterwards, for instance, to analyze existing log data (as we will show in the empirical application). The framework described in this paper can be used for any log data, as the reconstruction of the sequence of states is performed as a first step in the analysis of log data. It follows that indicators computed from states are expected to be comparable between different assessments when defined with respect to identical states, while indicators computed directly from the log events are prone to contain platform-specific characteristics.
To reconstruct the sequence of states using available log data from a completed assessment, the stored events must allow the states to be differentiated. Accordingly, not every set of states S can be analyzed using the platform-specific log events Σ of every platform. However, as noted above, the definition of states is intended to be independent of the so-called input alphabet \(\Sigma\) (i.e., all log events provided by an assessment platform). The considered states should be motivated and described theoretically and not narrowed to the specific characteristics of the log events available from an assessment platform.
Definition of a FSM
An FSM \(M = \left\langle {S, s_{0}, \Sigma, \delta, F} \right\rangle\) is defined by a finite number of states (i.e., a finite, non-empty set of states \(S\)), deterministic transitions between states (i.e., we use deterministic state machines) and triggers that provoke a particular transition from one state to another. An FSM starts in an initial state (s0) and is in only one state at a time (the current state). The set Σ is the input alphabet, and δ represents the state-transition function (i.e., a definition of the possible transitions between states). The FSM is expected to end in an accept state out of a set of final states \(F\) (a subset of \(S\)) when the stream of all input elements (i.e., the list of log events of type x ∈ Σ for a test taker) has been processed successfully, event by event.
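The five components of this definition can be written down directly as data. The following is a minimal sketch in Python, not an implementation taken from the framework: the state names, the event names, and the decision to leave the state unchanged when no transition is defined are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical states and event types, chosen only for illustration.
STATES: Set[str] = {"start", "stimulus", "question", "end"}      # S
INITIAL_STATE = "start"                                           # s0
FINAL_STATES: Set[str] = {"end"}                                  # F, a subset of S
ALPHABET: Set[str] = {"item_loaded", "next_page", "prev_page", "item_closed"}  # Sigma

# Partial state-transition function delta: (state, event) -> new state.
DELTA: Dict[Tuple[str, str], str] = {
    ("start", "item_loaded"): "stimulus",
    ("stimulus", "next_page"): "question",
    ("question", "prev_page"): "stimulus",
    ("question", "item_closed"): "end",
}


def run_fsm(events: Iterable[str]) -> Tuple[List[str], bool]:
    """Process events one by one; return the visited states and whether M ended in F."""
    state = INITIAL_STATE
    visited = [state]
    for event in events:
        # Assumption: an event without a defined transition leaves the state unchanged.
        state = DELTA.get((state, event), state)
        visited.append(state)
    return visited, state in FINAL_STATES


visited, accepted = run_fsm(
    ["item_loaded", "next_page", "prev_page", "next_page", "item_closed"]
)
```

Because the machine is deterministic, each pair of current state and event type maps to at most one new state, which is why a plain dictionary is sufficient here.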
Transitions between states
Transitions are triggered either by internal events (such as timers) or by external events (such as button clicks). Log events, processed by the FSM as the input alphabet \(\Sigma\), can be used to identify transitions and thereby states. Transitions are represented in the formal description of a state machine by a state-transition function \(\delta\). This function is typically called a partial state-transition function \(\delta \left( {q,x} \right) \to q^{\prime}\), because it only defines state transitions between states \(q\), q′ and selected triggers \(x \in \Sigma\). The state-transition function returns, for a current state q ∈ S of M, the new state \(q^{\prime} \in S\) when trigger x ∈ Σ occurs. Consequently, the transition triggered by a log event x ∈ Σ depends on the current state \(q\). This property in particular makes state machines a valuable tool for log data analysis, as it contextualizes the meaning of log events x (e.g., pressing the back button of a web browser) with respect to the FSM's current state q ∈ S (e.g., the current page).
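The back-button example can be made concrete with a small sketch. The page names and the event name `back_button` are hypothetical, and returning `None` for an undefined pair is only one possible way to express that the function is partial.

```python
from typing import Dict, Optional, Tuple

# Hypothetical pages (states) and a single event type.
DELTA: Dict[Tuple[str, str], str] = {
    ("page_2", "back_button"): "page_1",
    ("page_3", "back_button"): "page_2",
}


def delta(q: str, x: str) -> Optional[str]:
    """Partial transition function: return q' or None if (q, x) is undefined."""
    return DELTA.get((q, x))


# The same log event leads to different states depending on the current state.
from_page_2 = delta("page_2", "back_button")
from_page_3 = delta("page_3", "back_button")
from_page_1 = delta("page_1", "back_button")  # no transition defined
```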
FSMs can be visualized by directed graphs, typically called state diagrams. States are represented by circles and transitions are represented by arrows. For the analysis of log data, the arrows are linked to the triggers (e.g., log events) that are used to identify the transitions.
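A state diagram of this kind can be generated from the transition function. The sketch below writes a description in the Graphviz DOT language, which is a choice made here for illustration and not something the framework prescribes; the states and events are hypothetical, and the helper node `__start__` only marks the initial state.

```python
from typing import Dict, Set, Tuple

# Hypothetical transition function, as (state, event) -> new state.
DELTA: Dict[Tuple[str, str], str] = {
    ("start", "item_loaded"): "stimulus",
    ("stimulus", "next_page"): "question",
    ("question", "prev_page"): "stimulus",
    ("question", "item_closed"): "end",
}


def to_dot(delta: Dict[Tuple[str, str], str], initial: str, finals: Set[str]) -> str:
    """Render an FSM as DOT text: circles for states, labeled arrows for transitions."""
    states = sorted({q for q, _ in delta} | set(delta.values()) | {initial} | finals)
    lines = ["digraph M {", "  rankdir=LR;", "  __start__ [shape=point];"]
    lines.append(f'  __start__ -> "{initial}";')
    for s in states:
        # Final states are drawn with a double circle, a common convention.
        shape = "doublecircle" if s in finals else "circle"
        lines.append(f'  "{s}" [shape={shape}];')
    for (q, x), q_next in sorted(delta.items()):
        # Each arrow carries the trigger (log event) that identifies the transition.
        lines.append(f'  "{q}" -> "{q_next}" [label="{x}"];')
    lines.append("}")
    return "\n".join(lines)


dot = to_dot(DELTA, "start", {"end"})
```

The resulting text can be passed to any tool that reads DOT to draw the diagram.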
Extensions (guards, variables and look-ahead)
For practical applications of FSMs, the trigger used in the state-transition function does not necessarily refer only to a specific event type x ∈ Σ but can also refer to additional information specific to particular event types. Such properties of events, denoted as (e), for instance, the specific question an answer-change event belongs to, can be used to formulate conditions (guards) that must be fulfilled for M to accept an event x ∈ Σ in state q ∈ S. Moreover, extensions with respect to variables (known as extended state machines) can be used to identify state transitions with sparse log data. Finally, when log data are analyzed retrospectively with FSMs, guards that inspect not only the current log event but also all (or all remaining) events of an individual test taker (look-ahead) can be used in practice (see Table 3, below, for examples such as is_last_event and nearest_event_is).Footnote 3
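Guards and look-ahead can be sketched as predicates attached to transition rules. In the following Python sketch, the rule format, the state and event names, and the guard `question_is` are hypothetical. The guard `is_last_event` borrows its name from the text, but its behavior here (true when no further event follows) is an assumption inferred from the name, not a specification.

```python
from typing import Any, Callable, Dict, List, Tuple

Event = Dict[str, Any]
Guard = Callable[[Event, int, List[Event]], bool]


def question_is(question_id: str) -> Guard:
    """Guard on event data (e): the event must belong to the given question."""
    return lambda event, idx, events: event.get("question") == question_id


def is_last_event(event: Event, idx: int, events: List[Event]) -> bool:
    """Look-ahead guard (assumed semantics): no further events follow."""
    return idx == len(events) - 1


# Each rule: (current state, event type, guard, next state); checked in order.
RULES: List[Tuple[str, str, Guard, str]] = [
    ("stimulus", "answer_change", question_is("Q1"), "question_1"),
    ("stimulus", "answer_change", question_is("Q2"), "question_2"),
    ("question_1", "answer_change", question_is("Q2"), "question_2"),
    ("question_1", "next_page", is_last_event, "end"),
    ("question_2", "next_page", is_last_event, "end"),
]


def reconstruct(events: List[Event], initial: str = "stimulus") -> List[str]:
    """Return the state after each event, preceded by the initial state."""
    states = [initial]
    for idx, event in enumerate(events):
        state = states[-1]
        for source, event_type, guard, target in RULES:
            if source == state and event_type == event["type"] and guard(event, idx, events):
                state = target
                break
        states.append(state)
    return states


events = [
    {"type": "answer_change", "question": "Q1"},
    {"type": "answer_change", "question": "Q2"},
    {"type": "next_page"},
]
states = reconstruct(events)
```

Passing the full event list to every guard is what makes look-ahead possible; this only works because the analysis runs after the assessment has finished.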
Computing indicators using state machines
Using FSMs allows defining indicators with respect to the set of states S by combining theoretical input with empirical input [log events x ∈ Σ with additional event-data (e) and timestamps t] and knowledge about the platform and the implementation of computer-based tasks (see Fig. 3). This can be conducted for the test takers separately, each time starting with the state s0, and it is expected that the FSMs for each test taker reach one of the end states f ∈ F.
Reconstructed sequence of states
Using FSMs allows reconstructing how a test taker moved through the sequence of states q ∈ S distinguished in a particular FSM. Based on the identified states, various indicators can be computed. To include time in the FSM approach, the timestamps provided with the log events can be used.
Thus, the FSM approach disentangles processing log events (this is done using the FSM) and the computation of indicators (this is done using the reconstructed sequences of states). For empirical log data analysis, this offers the possibility to include paradata comprising multiple events in a coherent way. More specifically, it fosters the separation of steps required to parse and read the log data (i.e., the empirical input) from the steps used to extract meaningful indicators (Heerwegh 2003).
Augmented log data
The list of tuples \(\left\langle {i,t,x, \left( e \right)} \right\rangle\) that represents the empirical input for test taker i is augmented with additional information from the reconstructed sequences using an FSM M as follows: (1) the state q ∈ S of the FSM before an input element x ∈ Σ was processed (starting with s0 for the first tuple), (2) the current state q′ ∈ S of the machine after x ∈ Σ was processed, and (3) the relative time difference td to the previous log event (starting with zero for the first tuple in the list).Footnote 4 Each tuple \(\left\langle {i,t,x, \left( e \right)} \right\rangle\) in this list represents a log event of type \(x \in \Sigma\) from the input alphabet that occurred at time t and belongs to a test taker \(i = 1 \ldots I\). To reconstruct the sequence of states, the list of tuples is processed event by event and augmented with \(q\), \(q^{\prime}\), and \({\text{td}}\),Footnote 5 starting with s0 for each test taker (see Table 3, below, for an example).
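The augmentation step can be sketched as follows. The field names mirror the symbols used in the text (i, t, x, e, q, q′, td); the transition function, the event names, the timestamps in seconds, and the rule that an undefined transition leaves the state unchanged are assumptions for illustration.

```python
from typing import Any, Dict, List, NamedTuple, Tuple


class LogEvent(NamedTuple):
    i: int              # test taker
    t: float            # timestamp (assumed to be in seconds)
    x: str              # event type from the input alphabet
    e: Dict[str, Any]   # additional event data


class AugmentedEvent(NamedTuple):
    i: int
    t: float
    x: str
    e: Dict[str, Any]
    q: str        # state before the event was processed
    q_next: str   # state after the event was processed (q')
    td: float     # time difference to the previous event of the same test taker


# Hypothetical transition function.
DELTA: Dict[Tuple[str, str], str] = {
    ("start", "item_loaded"): "stimulus",
    ("stimulus", "next_page"): "question",
    ("question", "item_closed"): "end",
}


def augment(log: List[LogEvent], delta: Dict[Tuple[str, str], str], s0: str) -> List[AugmentedEvent]:
    """Process the log event by event, separately for each test taker."""
    by_person: Dict[int, List[LogEvent]] = {}
    for ev in sorted(log, key=lambda ev: (ev.i, ev.t)):
        by_person.setdefault(ev.i, []).append(ev)
    out: List[AugmentedEvent] = []
    for events in by_person.values():
        q, prev_t = s0, None                  # every test taker starts in s0
        for ev in events:
            q_next = delta.get((q, ev.x), q)  # assumption: undefined -> stay in q
            td = 0.0 if prev_t is None else ev.t - prev_t
            out.append(AugmentedEvent(ev.i, ev.t, ev.x, ev.e, q, q_next, td))
            q, prev_t = q_next, ev.t
    return out


log = [
    LogEvent(1, 0.0, "item_loaded", {}),
    LogEvent(1, 12.0, "next_page", {}),
    LogEvent(1, 30.0, "item_closed", {}),
    LogEvent(2, 5.0, "item_loaded", {}),
    LogEvent(2, 9.0, "item_closed", {}),
]
augmented = augment(log, DELTA, "start")
```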
In general, indicators derived from log data using FSMs can be formulated as different aggregates of the reconstructed sequences of states in the augmented log data. Discussing and elaborating all possible ways to compute indicators is beyond the scope of this paper. Instead, in the following, we describe three outputs of the FSM approach that provide the source for different types of indicators: the sequence of states, the state summary table, and the state transition table.
Sequence of states and sub-sequences
Concatenating all states of a test taker makes it possible to extract the sequence of states, which can represent an indicator in itself when the states of interest can occur in different meaningful orders, for instance, to identify problem-solving strategies (e.g., Tóth et al. 2014). The sequence of states can also be used to cluster test takers with respect to sequences (e.g., using edit distance as in Hao et al. 2015). Beyond the complete sequence, subsequences of a specific length can be counted automatically as output of the FSM approach, for instance, ordered as n-grams (e.g., He and von Davier 2016).
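Extracting the sequence and counting n-grams could look like the sketch below. The input is a hypothetical list of (q, q′) pairs for one test taker, and collapsing events that do not change the state is an assumption about how the sequence is concatenated.

```python
from collections import Counter
from typing import List, Tuple

# Hypothetical (q, q') pairs, one per log event of a single test taker.
pairs: List[Tuple[str, str]] = [
    ("start", "stimulus"),
    ("stimulus", "stimulus"),   # event that did not change the state
    ("stimulus", "question"),
    ("question", "stimulus"),
    ("stimulus", "question"),
    ("question", "end"),
]


def state_sequence(pairs: List[Tuple[str, str]]) -> List[str]:
    """Concatenate states, adding a new entry only when the state changes."""
    if not pairs:
        return []
    sequence = [pairs[0][0]]
    for q, q_next in pairs:
        if q_next != q:
            sequence.append(q_next)
    return sequence


def ngrams(sequence: List[str], n: int) -> Counter:
    """Count all contiguous subsequences of length n."""
    return Counter(tuple(sequence[k:k + n]) for k in range(len(sequence) - n + 1))


sequence = state_sequence(pairs)
bigrams = ngrams(sequence, 2)
```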
State summary table
A state summary table can be created for each test taker from the augmented log data, containing all defined states, how often each state was visited, the total time spent in each state, and additional measures for each state, such as the duration of the shortest and longest visit. Values of binary indicators or count indicators, such as indicators for the relevant page visit, the request of source information, and tool use, can be read directly from the state summary table. For states that occurred at least once, the time on state can be used to compute values for further metric indicators, such as the time on task (Goldhammer et al. 2014), the reading time (e.g., Richter and Naumann 2000) and the edit time (Almond et al. 2012).
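One possible way to build such a table from augmented rows is sketched below. Two assumptions are made: the time difference td of an event is attributed to the state q that was active before the event, and the visit to the last state has a duration of zero because no later event closes it. The rows and state names, including the never-visited state `help`, are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical augmented rows (q, q', td) for one test taker, td in seconds.
rows: List[Tuple[str, str, float]] = [
    ("start", "stimulus", 0.0),
    ("stimulus", "question", 20.0),
    ("question", "stimulus", 15.0),
    ("stimulus", "question", 5.0),
    ("question", "end", 10.0),
]
ALL_STATES = ["start", "stimulus", "question", "help", "end"]


def state_summary(rows: List[Tuple[str, str, float]], states: List[str]) -> Dict[str, Dict[str, Optional[float]]]:
    """Summarize visits and time per state; states never visited get zero visits."""
    visits: Dict[str, List[float]] = {s: [] for s in states}
    if rows:
        current, duration = rows[0][0], 0.0
        for q, q_next, td in rows:
            duration += td                        # td elapsed while in state q
            if q_next != q:
                visits[current].append(duration)  # close the visit to q
                current, duration = q_next, 0.0
        visits[current].append(duration)          # close the final visit
    return {
        s: {
            "visits": len(d),
            "total_time": sum(d),
            "min_visit": min(d) if d else None,
            "max_visit": max(d) if d else None,
        }
        for s, d in visits.items()
    }


summary = state_summary(rows, ALL_STATES)
```

A binary indicator such as "help was used" then reduces to checking whether the visit count of the corresponding state is greater than zero.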
State transition table
Summarizing the augmented log data for each test taker with respect to rows that contain different values in the state before (q) and the state after (q′) makes it possible to create a state transition table that counts the frequency of the directed transitions from one state to another. From the state transition table, indicators that refer to the transition between states, e.g., the frequency of backward navigation from questions to the stimulus, can be extracted. Moreover, the state transition table can be used to create an aggregated representation of the navigation between states, such as an adjacency matrix.
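Counting directed transitions and arranging them as an adjacency matrix can be sketched in a few lines. The (q, q′) pairs and the state order are hypothetical; rows of the matrix are the state before and columns the state after.

```python
from collections import Counter
from typing import List, Tuple

# Hypothetical (q, q') pairs from the augmented log data of one test taker.
pairs: List[Tuple[str, str]] = [
    ("start", "stimulus"),
    ("stimulus", "stimulus"),   # no state change, not counted as a transition
    ("stimulus", "question"),
    ("question", "stimulus"),
    ("stimulus", "question"),
    ("question", "end"),
]
STATE_ORDER = ["start", "stimulus", "question", "end"]


def transition_table(pairs: List[Tuple[str, str]]) -> Counter:
    """Count directed transitions, keeping only rows where q differs from q'."""
    return Counter((q, q_next) for q, q_next in pairs if q != q_next)


def adjacency_matrix(table: Counter, states: List[str]) -> List[List[int]]:
    """Matrix of transition counts: row = state before, column = state after."""
    return [[table.get((a, b), 0) for b in states] for a in states]


table = transition_table(pairs)
matrix = adjacency_matrix(table, STATE_ORDER)
backward_navigation = table[("question", "stimulus")]
```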