The Impact of Inserting an Additional Mental Process
Pure insertion describes a scenario where a mental process is inserted within a sequence of other processes without altering the other processes. Under the assumption of pure insertion, the duration of the inserted process can be identified by calculating the difference in overall response times when the process is present versus absent (i.e., Donders’ subtraction method). Additionally, under the assumption of pure insertion, brain regions associated with the inserted process can be identified in fMRI studies by contrasting activation when the process is present versus absent. However, the assumption of pure insertion does not hold in many situations. In this study, we adopted a novel approach for identifying the impact of insertion by decomposing the EEG signal into a sequence of latent stages, each with a distinct topographical distribution and duration. Based on these latent stages, it is possible to identify when, and for how long, a process occurred. We crossed two factors in the experiment: whether the trial required substituting a letter with a number from memory and whether the trial required calculating the product of two numbers. By crossing these factors, we could examine whether inserting substitution and calculation processes affected the durations of other mental processing stages. Behavioral data in the form of response latencies, and averaged EEG signal in the form of event-related potentials (ERPs), provided no evidence of violations of pure insertion. However, our analysis of the single-trial EEG signal allowed us both to show that inserting substitution or calculation did affect other stages and to understand why.
Keywords: Pure insertion; The subtraction method; Electroencephalography; Hidden semi-Markov models; Multivariate pattern analysis
Psychological research has long sought to characterize the mental processes that give rise to observable behavior. Researchers have examined the sequences and durations of information processing stages responsible for converting perceptual inputs to decisions and actions (Sternberg 1969; Townsend and Ashby 1983). One important tool in estimating the duration of a specific mental process is the subtraction method, first proposed by Donders (1969). This method applies to tasks that differ only in whether an extra cognitive process is inserted. It estimates the duration of the extra process by subtracting the response time (RT) in the simpler task from the RT in the more complex task (Donders 1969). Applying the subtraction method assumes that pure insertion holds: that is, a mental process can be inserted into a stream of processes without affecting other processes (also referred to as the assumption of common-process invariance, where the processes common to each pair of conditions remain constant as the extra process is inserted). However, whether and when the assumption of pure insertion holds has been a long-standing question (Pachella 1973; Massaro 1989; Friston et al. 1996; Logothetis 2008). In this work, we combine neuroimaging methods (electroencephalography or EEG) and statistical methods (hidden semi-Markov models and multivariate pattern analysis) to shed new light on the assumption of pure insertion.
Subtraction Method and Its Assumption of Pure Insertion
Donders introduced the subtraction method as a way to estimate the duration of a specific mental process by subtracting the RT in the simpler task from the RT in the more complex task (Donders 1969). The specific tasks that Donders considered were a simple RT task, a go/nogo task (a simple RT task with an additional stimulus discrimination process), and a choice RT task (a go/nogo task with an additional response selection process). The duration of stimulus discrimination was estimated as the difference between RTs in the go/nogo task and the simple RT task, and the duration of response selection was estimated as the difference between RTs in the choice RT task and the go/nogo task (for other examples, see Luce and Green 1972; Treisman and Souther 1985). A variant of the subtraction method, often referred to as cognitive subtraction, has also been used in functional magnetic resonance imaging (fMRI) and positron emission tomography (PET) studies (Sartori and Umiltà 2000). In these studies, assuming that the inserted cognitive process does not affect the brain activity of the existing cognitive processes, the activity associated with the inserted cognitive process is estimated based on the difference between brain activity in the experimental conditions with the process versus control conditions without the process (Logothetis 2008).
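The arithmetic behind the subtraction method can be illustrated with a minimal sketch. The RT values below are hypothetical and were invented purely for illustration; the function name `subtraction_estimate` is likewise an assumption and does not come from any of the cited studies.

```python
from statistics import mean


def subtraction_estimate(rt_complex, rt_simple):
    """Estimated duration of the inserted process: difference of mean RTs."""
    return mean(rt_complex) - mean(rt_simple)


# Hypothetical RTs in ms, invented for illustration only.
simple_rt = [210, 220, 230]    # simple RT task
go_nogo_rt = [290, 300, 310]   # adds stimulus discrimination
choice_rt = [350, 360, 370]    # adds response selection

# Each estimate is valid only if pure insertion holds for that pair of tasks.
discrimination_ms = subtraction_estimate(go_nogo_rt, simple_rt)
selection_ms = subtraction_estimate(choice_rt, go_nogo_rt)
print(discrimination_ms, selection_ms)
```

Under pure insertion, each difference isolates one process. If the inserted process also changes a shared stage, the same difference absorbs that change and misestimates the duration of the inserted process.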
Previous Approaches to Testing Pure Insertion
At least three different sets of methods have been developed to test the validity of pure insertion. The first set of methods utilized RT (Taylor 1966; Ashby 1982; Ashby and Townsend 1980; Ilan and Miller 1994). One early study by Taylor (1966) crossed two factors: stimulus discrimination and response choice. If pure insertion holds, neither of these factors should affect the durations of other cognitive processes required to complete the task. Therefore, one would expect that the contributions of the two factors to overall RT would be additive, as was found. Relatedly, Ashby (1982) used RT information to test the hypothesis that conditions with extra processes would have longer RTs than conditions without the processes. The ordering of RTs across conditions should hold both at the level of mean RTs and of cumulative RT distributions, which was found. Lastly, Ashby and Townsend (1980) proposed a stronger test in which RTs were decomposed into hypothetical components, each with different temporal distributions. They applied the test to data from a memory scanning task where subjects had to judge whether a single item was part of a previously presented set. The results supported the notion that including an extra item in the memory set added a mental processing stage without altering other stages’ durations.
The second set of proposed methods for testing pure insertion uses behavioral measurements besides RT to infer underlying cognitive processes. Ulrich et al. (1999) measured the magnitude and time course of response force to detect subtler changes in motor planning and execution. They gathered data in the same set of tasks first described in Donders’ experiment: a simple RT task, a go/nogo task, and a choice RT task. Responses were more forceful in the go/nogo task than in the simple RT task. This was contrary to the expectation, based on the pure insertion hypothesis, that response execution would be consistent across these two tasks.
The final proposed set of methods for testing pure insertion uses neuroimaging techniques (Vidal et al. 2011; Miller and Low 2001; Danek and Mordkoff 2011; Friston et al. 1996; Smid et al. 2000). For example, by examining event-related potentials (ERPs) over the primary and supplementary motor areas, Vidal et al. (2011) concluded that the same motor commands, issued in two different conditions, were affected by the insertion of an earlier stage. This outcome violates the pure insertion hypothesis. Relatedly, Miller and Low (2001) measured motor execution time as the lag between the onset of the lateralized readiness potential (LRP) and the production of the required response. They found equal motor execution time across a simple RT task, a go/nogo task, and a choice RT task, which supports the assumption of pure insertion. However, under a similar experimental design with the same approach, Danek and Mordkoff (2011) found differences in motor execution time across a simple RT task, a go/nogo task, and a choice RT task. Outside EEG research, Friston et al. (1996) conducted a PET study in which two cognitive components, object recognition and phonological retrieval, were inserted factorially. Activation in the inferotemporal region reflected an interaction between the two factors, which violates the assumption of pure insertion. While informative, measures of blood flow and oxygenation typically lack the temporal resolution needed to test the original notion of pure insertion, which concerns changes in the durations of mental processes.
Motivation and Overview of Current Experiment
The field has developed a rich set of tools for using latency patterns across conditions to test the assumption of pure insertion. Nonetheless, these tools all test conditions that are necessary, but not sufficient, for pure insertion to hold. Even if the latency patterns perfectly satisfy what would be expected if a new stage were simply added to the processing stream, it is possible that some of the additional time was due to changes in other stages. The only way to ensure that the other stages have not changed is to measure them directly, which is the goal of the current experiment.
Psychophysiological measures such as EEG afford millisecond resolution, making them attractive tools for tracking the time course of cognitive processes in a task (Coles 1989; Düzel et al. 1997; Woodman 2010). Central to EEG analysis is the use of event-related potentials (ERPs), which are voltage deflections over the scalp that are synchronized to observable events like stimulus presentation and response commission. Given the amount of noise in the EEG signal, traditional ERPs are obtained by averaging signals across multiple trials locked to the observable events. Traditional ERPs can be used to detect onsets of important mental processes and reveal differences in recorded brain activity across different experimental conditions over selected regions (i.e., electrodes) and time windows. However, the signal averaging process can distort ERP components given their trial-to-trial variability. Furthermore, the signal averaging process can fail to detect ERP components related to mental processes that are weakly synchronized to observable events (Luck et al. 2000).
To model the trial-to-trial variability of ERPs, we have developed an approach (referred to as HSMM-MVPA, combining hidden semi-Markov models with multivariate pattern analysis; Anderson et al. 2016) that allows us to identify ERP-like components as distinctive profiles of scalp activity (i.e., bumps) at variable latencies in each trial. We can estimate the number of bumps using a model selection procedure and identify the durations of the stages bounded by these bumps. A bump is modeled as a half-sine multidimensional peak across the scalp that signifies a significant change in information processing, followed by a flat period where the mean of the task-related, ongoing sinusoidal noise is assumed to be 0. This assumption is inspired by two theories of ERP generation (Yeung et al. 2004; Makeig 2002; Shah et al. 2004; Basar 1980). According to the classical theory, significant cognitive events generate bursts of activity in discrete brain regions (Shah et al. 2004). Therefore, the EEG signal can be described as a sum of sinusoidal peaks and ongoing neural signal of uncorrelated sinusoidal variation. Averaging the neural signal across trials will reveal the peaks as the averaged ERP waveforms that we see, whereas the other sinusoidal variation will average to zero. According to the second theory of ERP generation, the synchronized oscillation theory, significant cognitive events reset the phase of the oscillation at a certain frequency (Basar 1980). After averaging across trials, the resulting ERP waveforms are indistinguishable from those generated under the classical theory in simulated datasets (Yeung et al. 2007). Other methods exist for dealing with trial-by-trial variability in the latency of ERP components (Ouyang et al. 2015; Woody 1967; Cecotti and Ries 2017; Dmochowski and Norcia 2015; Mestre et al. 2014).
The primary advantages of HSMM-MVPA as compared to these other methods are its capability (i) to capture ERP-like components that are not locked to observable events and (ii) to estimate the number of ERP-like components in the task (for a review and comparison of methods, see Walsh et al. 2017).
HSMM-MVPA was originally developed as a method to uncover the sequence of information processing stages and their durations in a task (Anderson et al. 2016). We propose in the current paper that it can also be applied as a novel way to test the assumption of pure insertion. Once the EEG signal is decomposed into a sequence of latent stages interleaved with distinct bumps, we can estimate the number, timing, and topographical distribution of the bumps and the duration of each stage. Having identified the mapping between stages and cognitive processes, one can then examine whether the stages shared between two conditions have the same durations, as required for testing pure insertion.
Testing the Assumption of Pure Insertion in an Arithmetic Task
Encoding: Encode two items on the screen
Substitution: Substitute the letter C with its corresponding digit 3 (previously memorized mapping)
Calculation: Calculate the product of the resulting two digits 3 and 5
Mapping: Prepare the motor response by mapping the two-digit answer 15 to two fingers
Response: Execute the motor response.
Table 1 Four experimental conditions and their corresponding examples (for C = 3)
Null (3 5 → 3 5)
Calc (3 5 → 1 5)
Sub (C 5 → 3 5)
Both (C 5 → 1 5)
The pure insertion hypothesis holds that adding the requirement to perform substitution or mental arithmetic should increase overall RTs without affecting the durations of any of the preceding or following mental processes. The goal of the current work is to demonstrate the novel application of the HSMM-MVPA method to test pure insertion. The chosen task has two desirable properties to demonstrate the strengths of HSMM-MVPA in comparison with methods based on overall RT or ERPs:
First, the processes involved in this task have been extensively studied in neuroimaging experiments and are known to evoke a characteristic set of ERPs. Specifically, the retrieval of arithmetic facts is associated with the appearance of the late positive complex (LPC) over the posterior scalp (Kiefer and Dehaene 1997; Iguchi and Hashimoto 2000). Declarative retrieval, in turn, is associated with the parietal old/new effect, also over the posterior scalp (Curran 2000; Düzel et al. 1997). Using the same logic as Friston et al. (1996), we can compare ERPs across four conditions and examine if two cognitive factors are inserted in an additive manner without interaction. If that does not hold, it would violate the assumption of pure insertion at the level of brain activation.
While inserting substitution is expected to extend the duration of the mapping stage, we expect that overall RT will show no counter-indications to the assumption of pure insertion. This is because inserting a substitution process would prolong the duration of the mapping stage to the same extent in the two conditions with a substitution process (i.e., Sub and Both). Therefore, the effects of inserting a substitution process and inserting a calculation process will be additive and will not interact at the level of overall RT.
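The reasoning above can be made concrete with a small numerical sketch. All stage durations below are hypothetical values chosen for illustration, not estimates from the data, and the stage set is simplified. The sketch assumes that substitution lengthens the mapping stage by a fixed amount (`MAPPING_COST`) whenever it is inserted.

```python
# Hypothetical stage durations in ms (illustrative values, not data).
BASE = {"encoding": 200, "mapping": 300, "response": 150}
SUBSTITUTION = 400   # assumed duration of the inserted substitution stage
CALCULATION = 500    # assumed duration of the inserted calculation stage
MAPPING_COST = 80    # assumed extra mapping time whenever substitution occurs


def total_rt(substitution, calculation):
    """Overall RT as the sum of stage durations for one condition."""
    rt = sum(BASE.values())
    if substitution:
        rt += SUBSTITUTION + MAPPING_COST  # violation of pure insertion
    if calculation:
        rt += CALCULATION
    return rt


rt = {
    "Null": total_rt(False, False),
    "Calc": total_rt(False, True),
    "Sub": total_rt(True, False),
    "Both": total_rt(True, True),
}

# Interaction contrast on overall RT: zero means the factors look additive.
interaction = (rt["Both"] - rt["Sub"]) - (rt["Calc"] - rt["Null"])
# The subtraction estimate of substitution absorbs the hidden mapping cost.
substitution_effect = rt["Sub"] - rt["Null"]
print(interaction, substitution_effect)
```

In this sketch the interaction contrast is zero even though the mapping stage has changed, so additivity of overall RT cannot by itself rule out a violation of pure insertion.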
We have described our predictions of whether pure insertion holds in the given tasks. These predictions will be compared with the results of using the HSMM-MVPA method to test for pure insertion. The HSMM-MVPA method is applied independently, without knowledge of the model predictions. Testing for pure insertion using HSMM-MVPA included three steps. First, we examined whether HSMM-MVPA recovered the same set of cognitive stages as postulated in the process model, including the stages expected to be shared across conditions. Second, we estimated the durations of the recovered stages for each condition. Third, we compared the durations of stages common across conditions to determine whether inserting substitution and calculation altered their durations. This constitutes one of the most direct tests of pure insertion performed to date. We will compare conclusions reached when applying HSMM-MVPA with methods that are based on overall RT or ERPs and discuss the implications of their differences.
A total of 22 individuals from the Carnegie Mellon University community participated in a single session that lasted for two and a half hours for monetary compensation. All were right-handed. None reported a history of neurological impairment. One subject did not complete the experiment, and another did not comply with the experimental instructions. Data from the remaining 20 subjects were analyzed.
Participants memorized four letter-to-digit mappings during the study phase of the experiment. The four letters were randomly selected from the 26 letters of the alphabet for each participant. The four digits were always “2,” “3,” “4,” and “5.” Participants were trained to respond using the five fingers of the right hand. Each finger corresponded to one of the five digits (i.e., “1,” “2,” “3,” “4,” and “5”).
Upon completing the study phase, participants advanced to the test phase of the experiment. The test phase contained four conditions shown in Table 1: (1) Null—two digits appeared on the screen and participants typed the two digits; (2) Calc—two digits appeared on the screen and participants typed the product of the two digits; (3) Sub—a letter and a digit appeared on the screen and participants retrieved the digit that corresponded to the letter and typed the two digits; (4) Both—a letter and a digit appeared on the screen and participants retrieved the digit that corresponded to the letter and typed the product of the two digits. Each condition contained 16 unique pairs.1 Pairs were balanced such that each of the four letters occurred with equal frequency.
The study phase of the experiment had two purposes. First, it allowed participants to memorize the set of letter-digit pairs used in conditions of the later test phase that involved substitution. Second, it allowed participants to master the mappings between numerical values and response keys. At the start of the study phase, participants memorized four letter-to-digit mappings. The four letters were randomly selected for each participant from the 26 letters of the alphabet, and the four digits they mapped to were always “2,” “3,” “4,” and “5.” During their first three presentations, letter-digit pairs appeared at the center of the screen for 8000 ms. Participants were instructed to read and memorize the pairs.
After all pairs appeared three times each, participants were tested on their memories for the pairs. During each trial, a letter appeared and participants were instructed to recall and type the corresponding digit. They responded using their right hand, with each finger corresponding to one of the five digits respectively (i.e., “1,” “2,” “3,” “4,” and “5”). They were given unlimited time to respond. If the participant responded incorrectly, the correct answer was displayed for 1000 ms, and if they responded correctly, the word CORRECT appeared for 1000 ms. We employed a triple dropout procedure. When participants responded correctly, the letter-digit pair was removed from the list of pairs. Otherwise, the letter-digit pair was added to the end of the list. The memory portion of the study phase ended once the participant completed the list a total of three times (i.e., they responded correctly to each item a total of three times).
In the final portion of the study phase, participants practiced pressing sequences of keys using their right hand. In each trial, a pair of non-repeated digits, randomly drawn from the set (1, 2, 3, 4, 5), appeared on the screen. Participants were instructed (1) to press the corresponding keys in the order of the digits on the screen as quickly as possible and (2) to minimize the inter-key time. Participants were informed that training would end once satisfactory performance was reached, but they were not told the exact termination criteria. Training ended once the participant performed ten correct responses in a row with each inter-key interval falling below 400 ms. Training emphasized the accuracy of key presses, along with the need to prepare and execute key presses in quick succession, but allowed individuals to take time before the first key press to prepare the motor response.
The test phase consisted of seven cycles, each of which contained four blocks corresponding to the experimental conditions (Table 1). Condition order within a cycle was randomized. Each block contained 32 trials. At the start of each block, a prompt denoting the experimental condition (Null, Calc, Sub, or Both) appeared. At the start of each trial, a centrally presented fixation cross appeared for a variable duration (sampled uniformly from 400 to 600 ms). Then, depending on the experimental condition, two digits, or a letter and a digit, appeared at the center of the screen. Participants responded by pressing two keys. The correct answer was always a pair of non-repeated digits ranging from “1” to “5.”
Participants were instructed to respond as quickly and accurately as possible, while also minimizing the inter-key time. To incentivize participants to minimize inter-key time, we created a bonus point system that rewarded correct responses only, while penalizing time to press the first key and inter-key time. Specifically, 50 points were deducted for incorrect responses, and (50 − 25 × First_Key_Press − 50 × Inter_Key_Press) points were awarded for correct responses. Points from each trial were added to a running total that accumulated within each block. Time was measured in seconds. Inter-key time was penalized twice as much as time to initiate the first key press. This was to encourage participants to prepare motor responses for both key presses in advance, rather than pressing one key before or while preparing the second response. After the participant responded, feedback appeared on the screen for 1000 ms. Following incorrect responses, the word INCORRECT appeared along with the correct response and the number of points deducted. Following correct responses, the word CORRECT appeared along with the number of points awarded. Points earned were displayed at the end of each block. If the score was negative, participants were shown, and received, zero points.
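The scoring rule can be written out as a short sketch. The function names are hypothetical, and the sketch assumes that the zero floor is applied to the block total rather than to individual trials, which the description above leaves open.

```python
def trial_points(correct, first_key_s, inter_key_s):
    """Points for one trial; both times are in seconds."""
    if not correct:
        return -50.0  # fixed deduction for an incorrect response
    # Inter-key time is penalized twice as heavily as first-key time.
    return 50.0 - 25.0 * first_key_s - 50.0 * inter_key_s


def block_points(trials):
    """Running total over a block of (correct, first_key_s, inter_key_s).

    Assumption: a negative block total is floored at zero.
    """
    total = sum(trial_points(*trial) for trial in trials)
    return max(total, 0.0)


# A correct response with a 1.0 s first key press and a 0.2 s inter-key time.
print(trial_points(True, 1.0, 0.2))
```

Under this rule, a correct response earns points only if 25 × First_Key_Press + 50 × Inter_Key_Press stays below 50, for example a first key press within 2 s when the inter-key time is close to zero.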
Stimuli appeared on a 60-Hz LCD monitor set 60 cm from participants. The EEG was recorded from 128 Ag-AgCl sintered electrodes (10–20 system) using a Biosemi Active II System (BioSemi, Amsterdam, Netherlands). The EEG was referenced online to the combined common mode sense (CMS) and driven right leg (DRL) circuit. Electrodes were also placed on the right and left mastoids. Scalp recordings were algebraically re-referenced offline to the average of the right and left mastoids. The EEG and EOG signals were filtered with a bandpass of 0.1 to 70.0 Hz and were digitized at 512 Hz. The EEG recording was decomposed into independent components using the EEGLAB FastICA algorithm (Delorme and Makeig 2004). Components associated with eye blinks were automatically identified and projected out of the EEG recording. Epochs (from stimulus to response onset in each trial) were then extracted from the continuous recording and baseline-corrected over a 100-ms prestimulus interval. Epochs containing voltages above +100 μV or below −100 μV were excluded.
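The last two preprocessing steps (baseline correction and amplitude-based rejection) can be sketched as follows. This is an illustrative NumPy version, not the EEGLAB pipeline used in the study. It assumes that each epoch is an array of shape (channels, samples) in μV whose first `n_baseline` samples form the prestimulus interval, and that rejection is applied after baseline correction.

```python
import numpy as np


def baseline_correct(epoch, n_baseline):
    """Subtract each channel's mean over the first n_baseline samples."""
    baseline = epoch[:, :n_baseline].mean(axis=1, keepdims=True)
    return epoch - baseline


def keep_epoch(epoch, limit_uv=100.0):
    """True if no sample lies above +limit_uv or below -limit_uv."""
    return bool(np.all(np.abs(epoch) <= limit_uv))


# Toy epoch: 2 channels x 3 samples, with the first 2 samples as baseline.
epoch = np.array([[10.0, 10.0, 60.0],
                  [0.0, 2.0, -150.0]])
corrected = baseline_correct(epoch, n_baseline=2)
print(keep_epoch(corrected))  # the second channel exceeds the limit
```

At the 512-Hz sampling rate reported above, a 100-ms baseline corresponds to roughly 51 samples.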
In the analysis of event-related potentials (ERPs), we examined data from nine regions centered on electrodes F3, FZ, F4, C3, CZ, C4, P3, PZ, and P4. Each region contains seven electrodes.2 Collectively, these regions comprised three levels for the factors of laterality (left, mid, and right) and frontoparietal position (frontal, central, parietal). We analyzed data from these regions from 400 to 820 ms, the time window used by Iguchi and Hashimoto (2000) to measure the LPC in an arithmetic production task (also see Kiefer and Dehaene 1997). The experiment contained four conditions (Null, Calc, Sub, and Both), which were defined by the crossing of two factors: whether or not problems required calculation (Calc and Both versus Null and Sub) and whether or not problems required substitution (Sub and Both versus Calc and Null). To isolate these factors’ effects, we performed a 2 (Calculation versus No Calculation) by 2 (Substitution versus No Substitution) by 3 (left, midline, right) by 3 (frontal, central, parietal) repeated measures ANOVA. For all analyses involving a factor with more than two levels, we adjusted p values using the Greenhouse-Geisser correction.
HSMM-MVPA Applied to EEG
In our HSMM, we explicitly model the variability of endogenous ERP components that would otherwise be distorted or lost in the average waveforms. Previous applications of the HSMM-MVPA method to EEG data were effective in recovering the durations of the underlying processing stages (e.g., recollection, decision) and showed predictable changes with experimental factors (Anderson et al. 2016; Walsh et al. 2017; Zhang et al. 2017, 2018). The HSMM-MVPA method identifies brief, distinctive profiles of scalp activity (i.e., bumps) at variable latencies in each trial (Anderson et al. 2016). A bump is modeled as a half-sine multidimensional peak across the scalp that signifies a significant change in information processing, followed by a flat period where the mean of the task-related, ongoing sinusoidal noise is assumed to be 0. Our HSMM models the durations of the flats as gamma distributions.
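The generative assumptions behind the model (half-sine bumps separated by flats with gamma-distributed durations) can be sketched for a single PCA component as follows. This is an illustrative simulation, not the estimation code. The bump width of five samples follows the description of the model, while the midpoint sampling of the half-sine, the gamma shape of 2, and the function name `simulate_trial` are assumptions made for the example.

```python
import numpy as np

BUMP_SAMPLES = 5  # a 50-ms bump at 10-ms samples
# Half-sine sampled at the midpoint of each sample (assumed discretization).
BUMP = np.sin(np.pi * (np.arange(BUMP_SAMPLES) + 0.5) / BUMP_SAMPLES)


def simulate_trial(magnitudes, flat_scales, shape=2.0, rng=None):
    """Noise-free time course of one component: n bumps and n + 1 flats.

    magnitudes: bump amplitudes (one per bump).
    flat_scales: gamma scale parameters, in samples (one per flat).
    """
    if len(flat_scales) != len(magnitudes) + 1:
        raise ValueError("an n-bump trial needs n + 1 flats")
    rng = np.random.default_rng() if rng is None else rng
    pieces = []
    for i, scale in enumerate(flat_scales):
        n_flat = int(round(rng.gamma(shape, scale)))
        pieces.append(np.zeros(n_flat))  # flat: mean activity of 0
        if i < len(magnitudes):
            pieces.append(magnitudes[i] * BUMP)
    return np.concatenate(pieces)


trial = simulate_trial([1.0, -0.5], [5.0, 5.0, 5.0],
                       rng=np.random.default_rng(0))
print(len(trial))
```

Because the flat durations are random, the bumps fall at different latencies on each simulated trial, which is the variability that averaging across trials would smear.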
Prior to HSMM-MVPA, two steps of dimensionality reduction were carried out to simplify the analysis and to make the computations more efficient and tractable. First, the data were down-sampled to 100 Hz (i.e., 10-ms samples). Second, to deal with the highly inter-correlated nature of the EEG signal at the 128 sensors and to reduce the dimensionality of the signal, spatial PCA (i.e., across electrodes) on all trials was performed to generate orthogonal PCA dimensions. The first 10 PCA components were retained. These accounted for 85.1% of the variance in the signal. The PCA components were then z scored for each trial. As a result, the data for the analysis consisted of 10 orthogonal PCA components sampled every 10 ms and with constant mean and variability across trials. The data for the analysis included the period of time from stimulus onset to the first key response in each trial. Ten samples (100 ms) beyond the response were also included from each trial in the analysis to ensure that the bump reflecting the motor response was fully captured. We only considered data from correct trials that were longer than 400 ms and shorter than 3000 ms. Trials from the first block of each condition (i.e., the first four blocks of the experiment) were treated as training trials and were thus excluded from further analysis.
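The second reduction step can be sketched with NumPy. This is a minimal illustration under stated assumptions, not the code used for the analysis: trials are assumed to be already down-sampled arrays of shape (samples, electrodes), PCA is computed by singular value decomposition of the concatenated, mean-centered data, and z scoring is applied to each component within each trial. The function name `spatial_pca` is hypothetical.

```python
import numpy as np


def spatial_pca(trials, n_components=10):
    """Project trials onto the leading spatial PCA axes and z score them.

    trials: list of arrays, each of shape (samples, electrodes).
    Returns the per-trial z-scored component scores and the proportion of
    variance explained by the retained components.
    """
    stacked = np.vstack(trials)            # all samples from all trials
    mean = stacked.mean(axis=0)
    _, s, vt = np.linalg.svd(stacked - mean, full_matrices=False)
    explained = (s[:n_components] ** 2).sum() / (s ** 2).sum()
    axes = vt[:n_components].T             # electrodes x components
    scores = [(trial - mean) @ axes for trial in trials]
    zscored = [(x - x.mean(axis=0)) / x.std(axis=0) for x in scores]
    return zscored, explained


# Synthetic example: 3 trials of different lengths, 16 "electrodes".
rng = np.random.default_rng(0)
trials = [rng.normal(size=(n, 16)) for n in (40, 55, 60)]
zscored, explained = spatial_pca(trials, n_components=4)
print(round(float(explained), 3))
```

Z scoring within each trial gives every component a mean of 0 and unit variance in every trial, which corresponds to the constant mean and variability across trials described above.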
An n-bump HSMM requires estimating n + 1 stage distributions to describe the durations of the flats plus the n 5-sample bumps for each PCA component. The best model fit of such HSMMs is given by maximizing the summed log likelihood of the bumps and flats across all trials. For each trial, this log likelihood can be decomposed into two parts: the likelihood of the EEG data given that the bumps are centered at each time point and the likelihoods that the bumps are centered at those time points given the gamma distributions that constrain their locations. In other words, the HSMM must select bump locations within a trial to maximize the correspondence between the observed and the estimated EEG signal, while selecting relatively consistent flat durations across trials to maximize their fit to the gamma distributions. The estimation process has to consider all possible combinations of bump locations, and this is what is efficiently calculated by the dynamic programming associated with hidden semi-Markov models (Yu 2010). We follow closely the model selection and estimation procedure described in Anderson et al. (2016). In the current study, there are additional constraints on bump locations in order to fit an HSMM-MVPA to four conditions simultaneously.3 The HSMM methods also return the probabilities of each bump occurring at each time point on a trial-by-trial basis. These probabilities can be used to calculate the location of each bump in a trial, which is the sum of the time points in the trial multiplied by the corresponding probabilities that the bump occurred at those times. Mean stage durations for a particular subject can then be calculated as the average time between bumps across all trials within that subject.
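The last two steps (expected bump locations and the stage durations derived from them) can be sketched as follows. The sketch assumes that the fitted model supplies, for one trial, a matrix of probabilities with one row per bump and one column per 10-ms sample, with each row summing to 1. The function names and the convention that the first stage starts at sample 0 and the last stage ends at `end_sample` are assumptions for the example.

```python
import numpy as np


def expected_locations(bump_probs):
    """Expected sample index of each bump: sum over t of t * P(bump at t)."""
    times = np.arange(bump_probs.shape[1])
    return bump_probs @ times


def stage_durations(bump_probs, end_sample, sample_ms=10.0):
    """Durations (ms) of the n + 1 stages bounded by n bumps in one trial."""
    locations = expected_locations(bump_probs)
    edges = np.concatenate(([0.0], locations, [float(end_sample)]))
    return np.diff(edges) * sample_ms


# Toy trial with 2 bumps and 10 samples.
probs = np.zeros((2, 10))
probs[0, 2] = probs[0, 4] = 0.5   # first bump: equally likely at t = 2 or 4
probs[1, 7] = 1.0                 # second bump: certainly at t = 7
print(stage_durations(probs, end_sample=10))
```

Averaging these per-trial durations over all trials of a subject gives that subject's mean stage durations.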
We performed a repeated measures analysis of variance (ANOVA) with calculation and substitution as factors. Accuracy was higher in blocks that included calculation, F(1, 19) = 6.35, p < .05, and lower in blocks that included substitution, F(1, 19) = 21.98, p < .001. There was no interaction between calculation and substitution, F(2, 36) = .04, n.s.
[Table: Mean accuracy and response time (s), with SEMs in parentheses]
The main effect of calculation was not significant, F(1,19) = 0.656, n.s. However, this was qualified by a significant two-way interaction between laterality and calculation, F(2,38) = 4.794, p < .05, and a three-way interaction between laterality, frontoparietal location, and calculation, F(4,76) = 12.535, p < .001. As with substitution, the effect of calculation was greatest over the left central and parietal scalp. The two-way interaction between calculation and substitution was not significant, F(1,19) = 1.118, n.s., nor was any higher-level interaction involving these factors. In summary, the requirements to perform substitutions and calculations contributed additively to mean voltages over a left-lateralized, central-parietal region, consistent with the assumption of pure insertion.
Identifying the Stage Durations and the Bump Profiles in HSMM-MVPA
We first identified the stages shared across all four conditions by carrying out model selection in the Null condition. We performed leave-one-subject-out cross-validation (LOOCV) by fitting an HSMM to even-trial data from all but one subject and then using the HSMM to estimate the likelihood of the remaining subject’s odd-trial data.4 More bumps are preferred only when there is an improvement in model likelihood in a significant number of subjects as determined by a two-tailed sign test. We confirmed that an HSMM with three bumps outperformed all models with fewer bumps (19 out of 20 subjects for the 3-bump versus a 1-bump model, p < 0.0001; 17 out of 20 subjects for the 3-bump versus a 2-bump model, p < 0.01) and that an HSMM with four bumps did not significantly outperform the 3-bump model (12 out of 20 subjects for the 4-bump versus a 3-bump model, p > 0.05). Therefore, we continued our analysis with the 3-bump model.
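The sign test used for model selection can be reproduced with an exact binomial calculation. The sketch below assumes no ties (every subject favors one of the two models) and uses a hypothetical function name; it is not the code used in the original analysis.

```python
from math import comb


def sign_test_two_tailed(wins, n):
    """Exact two-tailed sign test p value under a fair-coin null."""
    k = max(wins, n - wins)
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Number of subjects (out of 20) whose held-out likelihood improved.
for wins in (19, 17, 12):
    print(wins, sign_test_two_tailed(wins, 20))
```

With 20 subjects, this calculation gives p < 0.0001 for 19 improvements, p < 0.01 for 17, and p > 0.05 for 12, in line with the values reported above.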
The four stages identified in the Null condition are elementary processes that should be shared across all conditions of the experiment. To recover other processes, we applied an HSMM-MVPA to the four conditions simultaneously. In addition to the stages shared across all four conditions (Pre-attention, Encoding, Mapping, Response), we inserted Calculation and Substitution stages depending on the condition. This full model was superior to two alternative models that do not include Calculation or Substitution stages.5 Figure 6b shows the durations of the six stages and the scalp topologies of the five bumps in the Both condition, which included substitution and calculation. Compared with the 3-bump model obtained previously, two additional bumps and stages are inserted. The posterior positivity in these two bumps is consistent with the two inserted stages relating to retrieval of variables and of multiplication facts. The positivity in the case of Substitution is consistent with the parietal old/new effect, a positivity that accompanies retrieval of information from memory (Curran 2000; Düzel et al. 1997). The positivity in the case of Calculation, in turn, is consistent with the LPC, a positivity associated with the retrieval of arithmetic facts (Kiefer and Dehaene 1997; Pauli et al. 2004, 2006). Although they were estimated in separate HSMMs, the processes shared between Fig. 6a and b show striking similarities in their bump profiles and stage durations.
Testing Pure Insertion by Comparing Stage Durations Across Conditions
The HSMM-MVPA that was applied to all four conditions provides estimates of bump latencies on a trial-by-trial basis. From these, it is possible to estimate mean stage durations for each condition and subject. These estimates provide direct evidence about whether the durations of mental processes shared across conditions are identical despite the addition of other mental processes (substitution and calculation). Pure insertion predicts that the durations of the same stages will be identical across different conditions.
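As an illustration of how bump latencies translate into stage durations, the sketch below treats each stage as the interval between successive bump latencies, with the first stage starting at stimulus onset and the last ending at the response. This is a simplification: it ignores the width of the bumps and the probabilistic nature of the latency estimates, and the function names and trial values are hypothetical.

```python
from statistics import mean

def stage_durations(bump_latencies, rt):
    """Convert one trial's bump latencies (ms from stimulus onset) into stage durations."""
    # Stage boundaries: stimulus onset, each bump, and the response.
    boundaries = [0.0, *bump_latencies, rt]
    return [later - earlier for earlier, later in zip(boundaries, boundaries[1:])]

def mean_stage_durations(trials):
    """Average each stage's duration over the trials of one subject and condition."""
    per_trial = [stage_durations(bumps, rt) for bumps, rt in trials]
    # zip(*per_trial) groups the durations by stage across trials.
    return [mean(stage) for stage in zip(*per_trial)]

# Hypothetical trials: (bump latencies in ms, response time in ms).
trials = [([100.0, 300.0, 700.0], 1000.0), ([120.0, 340.0, 800.0], 1200.0)]
print(mean_stage_durations(trials))
```

Three bumps yield four stage durations per trial, which can then be compared across conditions for the same subject.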
We then estimated gamma distributions separately for conditions that involved calculation versus conditions that did not. Including condition-specific durations in the Response stage significantly increased the likelihood of the data (16 out of 20 subjects; p = .01), while including condition-specific durations in the rest of the stages did not (p > .05). This indicates that the duration of the Response stage is altered when the calculation process is inserted, which also violates the assumption of pure insertion.
Pure insertion is the idea that a cognitive process can be added to a sequence of processes without affecting their durations or brain activation. This is often a tacit assumption that accompanies use of the subtraction method. Yet the assumption of pure insertion does not hold across all tasks and conditions. We conducted an EEG experiment with four conditions formed by crossing two factors, substitution and calculation. This allowed us to test whether the respective processes conformed with the assumption of pure insertion. Mean response times, the complete RT distributions and the ERP results were consistent with pure insertion. However, HSMM-MVPA provides an even more direct test of whether the durations of shared stages vary across conditions. In contrast to the RT and ERP results, HSMM-MVPA revealed that inserting a substitution process increased the duration of the response mapping stage, and inserting a calculation process decreased the duration of the response stage. We discuss each of these findings, specific to the current task, in turn.
Adding the substitution process increased the duration of the Mapping stage, violating the assumption of pure insertion. According to ACT-R’s theory of declarative memory (Anderson 2007; Anderson et al. 1998; Schneider and Anderson 2011), the time to retrieve an item depends on its associative strength with the current experimental context. Associative strength is weaker when more items are associated with the same experimental context. In the Sub and Both conditions, participants had to retrieve letter-to-digit mappings and digit-to-finger mappings. The earlier retrieval in each trial may have interfered with the later retrieval, prolonging the Mapping stage. Although calculation also involves retrieval, arithmetic facts are not specific to the experimental context and should not, therefore, interfere with the retrieval of items that are. Consistent with this expectation, inserting the Calculation process did not affect the duration of the Mapping stage.
Inserting the calculation process did decrease the duration of the Response stage, however. This unexpected violation of pure insertion may not be the result of the calculation process per se, but rather of another aspect of conditions that involved calculation: the reduced number of distinct responses.6 In particular, the product of two numbers is invariant to their presentation order. Therefore, in conditions with calculation, the possible number of responses is half the number in conditions without calculation. Decreased duration in the Response stage is consistent with Hick’s law (1952), which holds that overall RT should be less when there are fewer response options. HSMM-MVPA localized this speed-up to the Response stage, consistent with an interpretation in terms of Hick’s law. Note that the insertion of calculation increases response time by the addition of a stage but also speeds up the Response stage. As this example demonstrates, the subtraction method, as applied to overall RTs, would underestimate the duration of the calculation stage because of the contrasting effect of the calculation manipulation on the Response stage.
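The size of this underestimate can be illustrated numerically. The sketch below uses Hick’s law in its common form, RT = b · log2(n + 1); the slope, the response-set sizes, and the duration of the Calculation stage are all hypothetical values chosen for illustration, not estimates from our data.

```python
from math import log2

def hick_rt(n_alternatives: int, b_ms: float = 150.0) -> float:
    """Hick's law in its common form, RT = b * log2(n + 1); b_ms is a hypothetical slope."""
    return b_ms * log2(n_alternatives + 1)

# Hypothetical response-set sizes: calculation halves the number of alternatives.
speed_up = hick_rt(16) - hick_rt(8)

# Hypothetical true duration of the inserted Calculation stage.
calc_stage_ms = 400.0

# The subtraction method sees only the net change in overall RT.
rt_difference = calc_stage_ms - speed_up

print(f"Response-stage speed-up: {speed_up:.0f} ms")
print(f"Subtraction estimate: {rt_difference:.0f} ms (true duration {calc_stage_ms:.0f} ms)")
```

Whatever the actual values, any speed-up of the Response stage is subtracted from the apparent duration of the inserted stage.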
In the current study, we demonstrated a novel application of HSMM-MVPA to test pure insertion. Though the conclusion of whether or not pure insertion holds depends largely on the tasks and experimental conditions examined, the method we proposed to test the assumption can be generalized to other studies. For example, the method could be applied to the original set of go/nogo tasks used to study pure insertion and to determine the exact stages where the assumption of pure insertion is violated.
Comparison with Methods Using Response Time and Event-Related Potentials
The factorial design of the experiment allowed us to test for interactions between the two factors, calculation and substitution, when examining the assumption of pure insertion. If pure insertion holds, the two factors should have additive effects on both RT and brain activity (Taylor 1966; Friston et al. 1996). In the current experiment, results from the RT and ERPs were consistent with pure insertion. Calculation and substitution had additive effects on mean RTs and the RT distributions. Likewise, calculation and substitution had additive effects on ERP amplitudes over the left central and parietal scalp, consistent with pure insertion.
Why did the RT and ERP evidence support the assumption of pure insertion, whereas the HSMM-MVPA analysis rejected pure insertion for both calculation and substitution? First of all, additivity in RTs or ERPs is necessary, but not sufficient to establish pure insertion. We can be confident that pure insertion is violated when additivity is violated. However, there are cases where additivity holds even without pure insertion: for example, when an inserted stage has an additive effect on other stages. This was the case in the current experiment, where inserting the calculation process inadvertently altered the Response stage as well, and inserting the substitution process altered the Mapping stage as well. Secondly, the HSMM-MVPA method relies on more sources of information (i.e., both single-trial EEG data and RT) in contrast to ERP and RT methods that rely on only one source of information (i.e., averaged EEG data or RTs). Recording multiple, converging sources of evidence provides a more stringent test of a theory and may increase the likelihood of rejecting a hypothesis. This depends on having a theory-driven framework for integrating the multiple sources of data. The HSMM-MVPA method provides such a framework to utilize single-trial EEG data by modeling the scalp profiles and occurrence of ERP-like components across individual trials.
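A small numerical example shows how additivity in mean RT can coexist with violations of pure insertion. In the sketch below, every stage duration and side effect is a hypothetical value; only the direction of the side effects (a longer Mapping stage with substitution, a shorter Response stage with calculation) follows the results reported here.

```python
# Hypothetical stage durations (ms) in the Null condition.
BASE = {"Pre-attention": 100, "Encoding": 200, "Mapping": 300, "Response": 250}

def condition_rt(substitution: bool, calculation: bool) -> int:
    """Total RT as the sum of stage durations, with side effects on shared stages."""
    stages = dict(BASE)
    if substitution:
        stages["Substitution"] = 500  # inserted stage (hypothetical duration)
        stages["Mapping"] += 80       # side effect: Mapping is prolonged
    if calculation:
        stages["Calculation"] = 400   # inserted stage (hypothetical duration)
        stages["Response"] -= 60      # side effect: Response is shortened
    return sum(stages.values())

rt = {
    "Null": condition_rt(False, False),
    "Sub": condition_rt(True, False),
    "Calc": condition_rt(False, True),
    "Both": condition_rt(True, True),
}

# Interaction contrast: zero means the two factors are additive in RT.
interaction = (rt["Both"] - rt["Calc"]) - (rt["Sub"] - rt["Null"])
print("Interaction contrast:", interaction)
print("Subtraction estimate of Substitution:", rt["Sub"] - rt["Null"])
print("Subtraction estimate of Calculation:", rt["Calc"] - rt["Null"])
```

Because each side effect depends on one factor only, the interaction contrast is zero, yet the subtraction estimates differ from the durations of the inserted stages.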
Implications in fMRI Studies and Beyond
The standard approach to testing pure insertion in fMRI studies is to examine interactions between factors in terms of activation in specific brain regions (Friston et al. 1996). Using this approach, the current experiment, if adapted for fMRI, might fail to produce evidence of a violation of pure insertion despite the HSMM-MVPA evidence to the contrary. Both the Substitution stage and the Mapping stage involve retrieving context-specific experimental facts (letter-to-digit mappings or digit-to-finger mappings). If the retrieval process in the two stages recruits the same set of brain regions (e.g., left prefrontal cortex; Anderson et al. 2007), application of the subtraction method would fail to separate activation related to the substitution process from activation related to the mapping process.
In addition to the subtraction method used in the literature of RT and fMRI studies, there are other measures closely related to the assumption of pure insertion. In particular, “selective influence” posits that it is possible to design an experimental manipulation that affects a designated cognitive process and no other processes (Sternberg 1969a, b). Selective influence is a core assumption of many methodologies, including systems factorial technology (SFT), which is used to distinguish different types of information processing architectures (Townsend and Nozawa 1995). However, in the current work, the unexpected violation of pure insertion in calculation demonstrates that, while theoretically feasible, it may be difficult to implement selective influence in an experiment. The HSMM-MVPA method provides a powerful framework to directly test whether any other processes are altered while manipulating one process.
To summarize, the HSMM-MVPA method we used to test pure insertion provides information about the durations of each of a multitude of cognitive processes occurring within a task. This is in contrast with RTs, which only provide a measure of cumulative processing time. HSMM-MVPA is also extremely general and can be applied to tasks that involve a multitude of cognitive processes. Finally, HSMM-MVPA provides millisecond resolution about the timing of mental events. This makes HSMM-MVPA more suitable than other neuroimaging techniques such as fMRI and PET for examining pure insertion in tasks that occupy Newell’s cognitive band (1990). In the context of the current experiment, HSMM-MVPA revealed that substitution affects the durations of other ongoing processes that involve retrieval, consistent with ACT-R and other theories of associative memory.
Sixteen pairs of digits were used in each of the four experimental conditions. Pairs in conditions that involve calculation were (2,7), (7,2), (2,6), (6,2), (3,5), (5,3), (3,7), (7,3), (4,6), (6,4), (4,8), (8,4), (5,7), (7,5), (5,9), and (9,5). Pairs in conditions that do not involve calculation were (1,2), (2,1), (2,4), (4,2), (2,3), (3,2), (3,5), (5,3), (3,4), (4,3), (1,4), (4,1), (1,5), (5,1), (4,5), and (5,4). The underlined digits were replaced with corresponding letters in conditions that included substitution.
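The reduction in distinct responses in the calculation conditions can be verified directly from the pairs listed above. This is a minimal sketch, assuming the required response in those conditions is the product of the two digits; the variable names are illustrative.

```python
# Digit pairs used in the conditions that involve calculation (from the list above).
calc_pairs = [
    (2, 7), (7, 2), (2, 6), (6, 2), (3, 5), (5, 3), (3, 7), (7, 3),
    (4, 6), (6, 4), (4, 8), (8, 4), (5, 7), (7, 5), (5, 9), (9, 5),
]

# Multiplication is commutative, so each pair and its reversal share one product.
products = {a * b for a, b in calc_pairs}

print(len(calc_pairs), "ordered pairs,", len(products), "distinct products")
print(sorted(products))
```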
The electrodes included in each region were as follows: left-frontal (AFF5H, F7, F5, F3, FFT9H, FFT7H, FFC5H), mid-frontal (AFF1H, AFF2H, F1, FZ, F2, FFC1H, FFC2H), right-frontal (AFF6H, F4, F6, F8, FFC4H, FFC6H, FFT8H), left-central (FCC5H, FCC3H, C5, C3, C1, CCP5H, CCP3H), mid-central (FCC1H, FCC2H, C1, CZ, C2, CCP1H, CCP2H), right-central (FCC4H, FCC6H, C2, C4, C6, CCP4H, CCP6H), left-parietal (TPP7H, CPP5H, CPP3H, P7, P5, P3, PPO5H), mid-parietal (CPP1H, CPP2H, P1, PZ, P2, PPO1, PPO2H), and right-parietal (CPP4H, CPP6H, TPP8H, P2, P4, P6, PPO6H).
If we allow bumps to be placed as close together as possible, an HSMM-MVPA can pick up two distinct, highly correlated bumps that come from the same cognitive process. This can occur either in the presence of alpha ringing (Makeig 2002), where event-related potentials give lasting oscillations that are fit by more than one bump, or when the underlying width of the bump is much wider than the assumed 50 ms. This is not problematic as long as one can identify the correct interpretation of each stage afterwards. However, in this study, each condition is composed of a different subset of all the possible stages. In order to apply an HSMM-MVPA to all four conditions simultaneously, one needs to establish correspondence of interpreted stages across conditions. In this case, a repetitive stage in one condition may cause the omission of another stage in the same condition or misalignment of stages across conditions. To prevent this from happening, we added a constraint to the model that restricts the placement of closely spaced, highly correlated bumps. In particular, if the flat between two bumps has the minimum length of 5 samples (50 ms), the bumps must have zero correlation. If the bumps are more than M samples apart, there is no constraint on their correlation. Between 5 and M samples, the maximum possible correlation is (#samples − 5)/(M − 5). Parameter M is chosen in a cross-validation framework, with even trials in each subject used for training and odd trials for testing. A model is first trained on even trials of (N − 1) subjects and then tested on odd trials of the left-out subject. We prefer a model that does not give closely spaced bumps (i.e., large M), unless there is evidence that a smaller M value generalizes well to a majority of subjects. Assuming the stage composition for each condition described in Fig. 6b, M = 16 is the largest M that is not worse than a smaller M value in 15 or more subjects (two-tailed sign test, n = 20). Therefore, we adopted an HSMM with M = 16 in our study. It is also the point beyond which there is a sharp drop in model likelihood averaged across 20 subjects when M increases.
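The correlation constraint described in this note can be written as a short function. The sketch below is illustrative only: the function name is hypothetical, returning None stands for "no constraint", and raising an error for gaps below 5 samples reflects our reading that 5 samples is the minimum allowed flat.

```python
def max_bump_correlation(gap_samples: int, m: int = 16, min_gap: int = 5):
    """Upper bound on the correlation between two neighbouring bumps.

    gap_samples is the length of the flat between the bumps, in samples.
    Returns None when the gap exceeds M, meaning the correlation is unconstrained.
    """
    if gap_samples < min_gap:
        raise ValueError("flats shorter than the minimum gap are not allowed")
    if gap_samples > m:
        return None
    # Linear ramp from 0 at the minimum gap to 1 at a gap of M samples.
    return (gap_samples - min_gap) / (m - min_gap)

for gap in (5, 10, 16, 20):
    print(gap, max_bump_correlation(gap))
```

With M = 16, the bound rises linearly from 0 at a 5-sample gap to 1 at a 16-sample gap, so a larger M penalizes closely spaced, similar bumps over a wider range of gaps.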
The leave-one-subject-out framework aims to obtain a model with generalizability across subjects. Splitting even and odd trials for training and testing ensures that the model evaluation on testing data in each subject is independent of that in another subject.
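The splitting scheme can be sketched as follows. The function and variable names are hypothetical, the HSMM fitting and likelihood evaluation are omitted, and the sketch assumes trials are numbered from 1, so that list index 0 holds trial 1 (an odd trial).

```python
def loso_even_odd_splits(trials_by_subject):
    """Yield (left_out, train, test) for leave-one-subject-out cross-validation.

    Training data: even-numbered trials of every subject except the left-out one.
    Test data: odd-numbered trials of the left-out subject.
    """
    for left_out in trials_by_subject:
        train = {
            subject: trials[1::2]  # trials 2, 4, 6, ... (even-numbered)
            for subject, trials in trials_by_subject.items()
            if subject != left_out
        }
        test = trials_by_subject[left_out][0::2]  # trials 1, 3, 5, ... (odd-numbered)
        yield left_out, train, test

# Hypothetical data: trial numbers stand in for single-trial EEG records.
data = {"s1": [1, 2, 3, 4], "s2": [1, 2, 3, 4], "s3": [1, 2, 3, 4]}
for left_out, train, test in loso_even_odd_splits(data):
    print(left_out, train, test)
```

Because each subject contributes only even trials to training and only odd trials to testing, no trial is used both to fit one model and to evaluate another.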
The model used in Fig. 6b (model 1) explains the additivity in total time by assuming that it reflects the insertion of additional stages. We explored whether it was necessary to insert additional stages or whether we could account for the additivity by allowing stage durations to vary. We considered two alternative models, model 2a and model 2b. In model 2a, the Calculation stage has been eliminated and there is a Combined stage, but that Combined stage can have a different duration when a calculation is also required (i.e., conditions Calc and Both) than when it is not (i.e., conditions Null and Sub). In model 2b, the Substitution stage has also been eliminated and a separate stage duration is estimated for each of the four conditions. In a LOOCV comparison, model 1 outperforms model 2a in 16 out of 20 subjects (two-tailed sign test, p = .01) and model 2b in 17 out of 20 subjects (two-tailed sign test, p = .002).
We considered equating the number of response options between conditions with and without calculation. However, given that responses are insensitive to ordering in conditions with calculation (e.g., both 3 × 5 and 5 × 3 give 15), it would be necessary to include twice as many different problems. This would likely have greater effects on the durations of multiple different processing stages.
This research was supported by the National Science Foundation Grant 1420009, the James S. McDonnell Foundation Scholar Award 220020162, and the Office of Naval Research Grant N00014-15-1-2151 to J. R. A. and M. M. W.
- Anderson, J. R. (2007). How can the human mind occur in the physical universe? Oxford University Press.
- Basar, E. (1980). EEG brain dynamics: relation between EEG and brain evoked potentials. Amsterdam: Elsevier.
- Delorme, A., & Makeig, S. (2004). EEGLAB: An open source toolbox for analysis of single-trial EEG dynamics including independent component analysis. Journal of neuroscience methods, 134(1), 9–21.
- Massaro, D. W. (1989). Experimental psychology: an information processing approach. Harcourt Brace Jovanovich.
- Mestre, M. R., Godsill, S. J., & Fitzgerald, W. J. (2014). Bayesian detection of single-trial event-related potentials. In Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on (pp. 4693–4697). IEEE.
- Newell, A. (1990). Unified theories of cognition. Harvard University Press.
- Pachella, R. G. (1973). The interpretation of reaction time in information processing research (No. TR-45). Michigan Uni Ann Arbor Human Performance Center.
- Pauli, P., Lutzenberger, W., Rau, H., Birbaumer, N., Rickard, T. C., Yaroush, R. A., & Bourne Jr, L. E. (1994). Brain potentials during mental arithmetic: effects of extensive practice and problem difficulty. Cognitive Brain Research, 2(1), 21–29.
- Pauli, P., Lutzenberger, W., Birbaumer, N., Rickard, T. C., & Bourne, L. E. (1996). Neurophysiological correlates of mental arithmetic. Psychophysiology, 33(5), 522–529.
- Townsend, J. T., & Ashby, F. G. (1983). Stochastic modeling of elementary psychological processes. CUP Archive.
- Yeung, N., Bogacz, R., Holroyd, C. B., Nieuwenhuis, S., & Cohen, J. D. (2007). Theta phase resetting and the error-related negativity. Psychophysiology, 44, 39–49. https://doi.org/10.1111/j.1469-8986.2006.00482.x.
- Yu, S. Z. (2010). Hidden semi-Markov models. Artificial Intelligence, 174, 215–243.
- Zhang, Q., Vugt, M., Borst, J. P., & Anderson, J. R. (2018). Mapping working memory retrieval in space and in time: a combined electroencephalography and electrocorticography approach. NeuroImage, 472–484. https://doi.org/10.1016/j.neuroimage.2018.03.039.