Successful text comprehenders construct an integrated, coherent, and accurate mental representation of the state of affairs described by the text. The construction of this situation model requires the reader to go beyond a representation of the surface characteristics of the text by generating inferences and incorporating world knowledge from long-term memory (Kintsch & van Dijk, 1978). The construction of the situation model is a dynamic process (e.g., Kintsch, 1998; McNamara & Magliano, 2009; Rapp & van den Broek, 2005); the text is processed incrementally (e.g., word by word and sentence by sentence) and, therefore, the situation model is constantly being updated as the text unfolds. As each new piece of information is processed, it must be integrated with the mental representation constructed so far. This involves monitoring for comprehension to identify when and where additional processing, such as inference generation, is necessary to ensure coherence (Kintsch, 1998; Perfetti, Stafura, & Adlof, 2013).

Comprehension monitoring is the metacognitive awareness that readers have about what they are reading (Wagoner, 1983). Baker (1985) distinguished two monitoring phases: evaluation and regulation. Evaluation, more recently defined as validation (Singer, 2013), refers to the process that allows readers to detect an inconsistency or mismatch in the text (e.g., Vauras, Kinnunen, Salonen, & Lehtinen, 2008). Current evidence on this process in adult readers converges on the view that evaluation is a routine, passive, and nonstrategic reading activity that depends on both the activation of current information and the integration of that information with previous text information or world knowledge (Kendeou, 2014). On the other hand, regulation is associated with the repair processes that are necessary to incorporate the new information into the current memory representation (e.g., O’Brien, Rizzella, Albrecht, & Halleran, 1998). According to Hacker (1998), comprehenders self-regulate their reading by asking themselves questions (evaluation) and updating their situation model (e.g., revising inconsistent information). Updating is a type of regulation that includes a broad category of processes and for coherent texts it is achieved with little cognitive cost. When a mismatch between the current situation model and new information is detected in the evaluation phase, the updating process will involve more than simply integrating just-read information into the situation model. In such instances, readers may revise the situation model by modifying or replacing information. Revision of the situation model can only be achieved if readers are able to adequately regulate their comprehension. Hereafter, we refer to regulation as a revision process.

The revision process is a specific updating activity that involves the inhibition of an interpretation that was encoded into the situation model in favour of the new information (Rapp & Kendeou, 2007). Interestingly, both the evaluation of mismatches and the revision of no-longer-relevant information can occur at an inferential processing level. For example, if the text supports the generation of a specific inference (e.g., ‘A mouse was looking for something to eat while a bigger animal was waiting to hunt it,’ which supports the inference of ‘cat’), only readers who generate that inference will be able to detect a subsequent mismatch (e.g., ‘The dog jumped out and scared the mouse’; see Footnote 1). Readers who detect that mismatch should then revise their situation model by replacing ‘cat’ with ‘dog’ to ensure that the situation model is an accurate representation of the text. The need to revise may not always be triggered by an explicitly stated concept (such as ‘dog’ in the previous example). A continuation that invites an inference (e.g., ‘The bigger animal barked loudly…’) would also require readers to revise the earlier inference that the animal was a ‘cat.’ We have called this process inferential revision, and a deeper understanding of how individual differences in working memory affect this process is the main goal of the present study.

A number of different paradigms show that while readers easily incorporate new information into their situation model (e.g., Rapp & Kendeou, 2007; Rapp & Taylor, 2004), they do not always successfully revise their mental representation when new information contradicts previously stated information (e.g., Guéraud, Harmon, & Peracchi, 2005; O’Brien et al., 1998; Rapp & Kendeou, 2007, 2009). In a classic example, O’Brien et al. (1998) found that participants took longer to read a sentence regarding a person’s behavior when that behavior contradicted earlier information (e.g., reading ‘Mary ordered a cheeseburger and fries’ after ‘Mary, a health nut, had been a strict vegetarian for 10 years’). This finding indicates that although participants detected the inconsistency between the character’s behavior and earlier information, they experienced difficulty integrating the new information into their mental representation. This comprehension difficulty was reduced, but still evident, in a qualified condition that provided an additional explanation for the character’s behavior, encouraging a revision of the situation model (‘Nevertheless, Mary never stuck to her diet when she dined out with her friends’). If participants had successfully revised their situation model to incorporate this qualification, there would have been no comprehension difficulty. Thus, when new information is inconsistent with prior parts of the text, successful understanding requires the revision of the situation model.

One reason for a failure to revise the situation model is that readers have problems replacing outdated information (Kendeou, Smith, & O’Brien, 2013; O’Brien, Cook, & Guéraud, 2010). The Knowledge Revision Comprehension framework (KReC; Kendeou & O’Brien, 2014) proposes that once information is retrieved from long-term memory, the activation of the new information competes with the no-longer-relevant information, drawing activation away from the now-outdated information. Importantly, if the activation of the outdated information is not sufficiently reduced by the competition mechanism, this information may interfere with the new information, making revision of the situation model difficult. Empirical support for this comes from studies of children and adults with poor text comprehension. These studies show a relationship between poor comprehension and difficulties with working memory, a key cognitive resource that supports the general processes involved in the construction of the situation model. Critically, these working memory difficulties are associated with deficient suppression or inhibitory control over the contents of working memory (Cain, 2006; Carretti, Cornoldi, De Beni, & Palladino, 2004; Pimperton & Nation, 2010). For example, Carretti et al. (2004) found that good comprehenders had higher working memory capacity than poor comprehenders and were also better at recalling words in a categorization task: they made fewer intrusion errors (words that had been categorized but were not listed as final words and so should not be recalled). These findings suggest that the relationship between poor reading comprehension and poor working memory may be related to difficulties that poor comprehenders have with inhibiting irrelevant information.

These studies linking poor comprehenders’ working memory capacity to problems with inhibition also fit well with Gernsbacher’s model of text comprehension: the Structure-Building framework (Gernsbacher, 1990, 1997). This framework proposes that readers with low working memory capacity have problems with text comprehension because of difficulties with suppressing irrelevant information. As a result, they generate new substructures rather than integrating new information into the situation model and therefore produce a less coherent situation model than good comprehenders (e.g., Gernsbacher & Faust, 1991; Gernsbacher, Varner, & Faust, 1990). This body of work provides both empirical and theoretical support for the proposal that readers with good working memory may be better able to revise their situation model than readers with poor working memory, because they are more efficient at inhibiting no-longer-relevant information, an essential process to construct an accurate and coherent situation model.

Surprisingly, there are very few studies investigating how individual differences in working memory are associated with the process of inferential revision. Indirect evidence for such an association comes from a study by Dutke and von Hecker (2011). They investigated how working memory capacity affects the process of revising the situation model when ambiguous information is read. They presented narrative texts about the social relations between protagonists to adult readers with high or low working memory span. High-span readers were better able than low-span readers to inhibit an earlier representation (e.g., ‘Franco and Salvatore relied on each other’) that was incompatible with new information (e.g., ‘Franco found Salvatore’s anxieties very annoying’). This suggests that an individual’s memory capacity is related to their ability to revise a situation model. However, although Dutke and von Hecker’s (2011) study focused on the structure of social relations described in the text to investigate inferential revision, they did not use moment-by-moment processing measures to study revision during the construction of the situation model. Moreover, in a separate study, Dutke and colleagues found evidence indicating that some of the situation model revision took place after reading the text (Dutke, Baadte, Hähnel, von Hecker, & Rinck, 2010), which leaves open the question of whether the revision process may occur when reading the text (that is online) under certain circumstances.

From our point of view, it is important to understand the time course and accuracy with which readers revise inferential information in their situation model, to elucidate reading comprehension problems at a high level of processing. Thus, the present study aims to investigate the inferential revision process in two important ways. First, we developed a paradigm that enabled us to dissociate the two key components of comprehension monitoring described earlier: the detection of a mismatch (evaluation process) and the updating of no longer relevant information (revision process) at an inferential processing level. Second, we examined how individual differences in working memory were associated with the process of revising the situation model. To explore these issues, we recorded reading times and electrophysiological brain activity during online reading.

Behavioral measure

Reading times

Reading times are an established way to assess processing difficulty. Readers typically take longer to read a sentence if they detect a mismatch between the text and what they have read previously (O’Brien et al., 1998), and they also take longer to read critical sentences when inferential processing is required (McKoon & Ratcliff, 1992). Our texts contained a sentence that either prompted readers to make a revision or not to the current situation model. Reading times for this sentence were compared across these conditions to determine whether participants had successfully evaluated the new information against the existing situation model. The reading times did not enable us to establish whether or not participants had actually revised their situation model and replaced the now-incorrect inference with the new correct inference. Because event-related potentials (ERPs) are a robust means to study the precise time course of many cognitive processes, we examined distinct ERP components to a subsequent critical word in the text to investigate the revision process.

ERP measures

P3a and P3b subcomponents

A relevant theoretical framework for the purposes of the present experiment is the context-updating theory (Polich, 2003, 2007). This framework distinguishes two subcomponents of the P300: a central–frontal positivity, or ‘P3a’ (e.g., Debener, Makeig, Delorme, & Engel, 2005), which is evident when incoming information is evaluated as new or different with respect to the current representation and therefore demands attentional control; and a temporo–parietal positivity, or ‘P3b’ (e.g., Hartikainen & Knight, 2003), which has been found when the context of the incoming stimulus involves updating by memory processes. Generally, the P3a is assumed to reflect mechanisms of attentional orientation driven by a target or novel stimulus (see Friedman, Cycowicz, & Gaeta, 2001, for review), whereas the P3b is related to processing capacity, being affected by the allocation of cognitive resources, the relevance of the stimulus to the task, and the probability of the stimulus (e.g., Kok, 2001). These findings suggest the existence of a brain circuit encompassing a) a bottom-up, stimulus-driven process that takes place in frontal areas (P3a) and b) a top-down, memory-driven process, which is guided by updating operations and occurs in parietal areas (P3b; see Polich, 2003). Accordingly, we used the P3a as an additional index of the detection of mismatches (evaluation process) and the P3b as an index of the updating of no-longer-relevant information (revision process).

Moreover, some studies have demonstrated a selective relationship between the reduction of the P3b amplitude and poor execution in several capacities, such as comprehension monitoring (Getzmann & Falkenstein, 2011) and working memory capacity (Evans, Selinger, & Pollak, 2011). In support of this distinction, and of relevance to the current study, is work by Getzmann and Falkenstein (2011). They compared younger (19–25 years) and older (54–64 years) adults’ performance on a comprehension monitoring task in which participants had to respond according to the stock price of a specific company, while ignoring other prices and beep sounds. Interestingly, participants did not show significant differences in the behavioral results (accuracy and response times), but electrophysiological differences emerged. Specifically, the older adults manifested an increased right-frontal P3a (only in high-performing older adults) and a reduced parietal P3b relative to the younger adults. The authors interpreted the P3a result as an age-related compensatory mechanism and the P3b result as an effect of age-related decline in spoken-language comprehension. These data signal that the P300 subcomponents may reflect individual differences in language comprehension.

N400 component

Another ERP of interest is the N400, which is an index of the ease with which the meaning of a word can be integrated into the current situation model (see Kutas & Federmeier, 2009, 2011). The amplitude of the N400 is reduced when there is a good fit between the word being processed and the context, in comparison to a poorly fitting word. For example, Kuperberg, Paczynski, and Ditman (2011) demonstrated a larger N400 for words causally unrelated to an inference supported by the text (e.g., ‘Jill’s skin always tanned well. She always put on sunscreen. She had sunburn on Monday.’) compared to causally related words (e.g., ‘Jill had very fair skin. She forgot to put sunscreen on. She had sunburn on Monday.’). However, although there have been several electrophysiological studies demonstrating a relation between working memory and inference making (e.g., St. George, Mannes, & Hoffman, 1997) and the evaluation of coherence breaks (e.g., Virtue, Haberman, Clancy, Parrish, & Jung-Beeman, 2006), no study to date has investigated the relationship between working memory and the revision of the situation model using ERPs. Therefore, a second aim of the present study was to explore if these ERP components (P3a, P3b, and N400) reflected individual differences in working memory associated with the construction of the situation model.

The current study

To address our aims, we developed the situation model revision task (see Table 1). Participants read short texts in which Sentences 1 through 3 provided an introduction for which at least two different concepts could be inferred by means of the generation of knowledge-based elaborative inferences (McKoon & Ratcliff, 1980). Both concepts were plausible, but one was considered more likely (e.g., ‘guitar’) by independent judges (see Table 1). There were three versions of the subsequent Sentence 4: a neutral condition, which did not refer directly or indirectly to either concept; a no-revise condition, which mentioned a property consistent with either concept (e.g., ‘…beautiful curved body’); and a revise condition, which referred to a property that was consistent with only the less likely concept (e.g., ‘…matching bow’). This latter condition should prompt readers to revise the situation model to ensure good comprehension. Reading times were measured for this sentence. The final word in Sentence 5 was always inconsistent with the concept supported in the introduction but consistent with the concept supported in the revise condition (e.g., ‘violin’). This word was called the disambiguating word, and ERPs were recorded here.

Table 1 Example of text used in the situation model revision task

Our predictions were as follows. First, in relation to the behavioral data, if readers generate and encode the inference supported by the introduction (‘guitar’), then they will show longer reading times for Sentence 4 in the revise condition (‘matching bow’) compared to the neutral (‘national concert hall’) and no revise (‘beautiful curved body’) conditions. This effect would signal the ability to detect a mismatch (evaluation process) when new information does not match the current situation model (e.g., Bohn-Gettler, Rapp, Van den Broek, Kendeou, & White, 2011). As noted, longer reading times do not enable us to establish if readers are able to replace the incorrect inference with the alternative inference when prompted by Sentence 4 in the revise condition. The ERP data, registered for the disambiguating word of Sentence 5, help us to understand whether readers not only detect a mismatch when reading Sentence 4 in the revise condition, but also whether they successfully revise their situation model, as detailed below.

In line with Polich (2003, 2007), the P3a subcomponent shows if a word is evaluated as new or different with respect to the current mental representation. Thus, if readers activate the alternative inference when reading Sentence 4 in the revise condition, then they will exhibit a reduction of the P3a to the disambiguating word (‘violin’) in the revise condition (‘matching bow’) compared with the neutral (‘national concert hall’) and no revise (‘beautiful curved body’) conditions. This pattern would signal not only the activation of the new inference when reading Sentence 4 in the revise condition but also the mismatch detection for Sentence 5 in the neutral and no revise conditions. The P3b subcomponent indicates if a word prompts a revision of the situation model. Therefore, similar to the P3a, if readers are able to update the alternative inference and draw activation away from the previous incorrect inference when reading Sentence 4 in the revise condition, they will show a reduction of the P3b to the disambiguating word in the revise condition compared with the neutral and no revise conditions. This effect would demonstrate that readers not only activate the new inference when reading Sentence 4 in the revise condition but they are also able to revise their mental representation. Finally, the N400 component is an index of the ease with which information can be integrated into a reader’s situation model. If readers integrate the alternative inference when reading Sentence 4 in the revise condition, then they will demonstrate a reduction of the N400 to the disambiguating word in the revise condition compared with the neutral and no revise conditions. This result would indicate that readers are able to integrate the new inference into their situation model when reading Sentence 4 in the revise condition.

Furthermore, if readers with low working memory are less able to evaluate their comprehension and revise their situation model than readers with high working memory, they will not manifest significant differences between conditions in the disambiguating word compared with high working memory readers, who will. However, because to our knowledge this is the first ERP study investigating individual differences in the inferential revision process, we did not make specific hypotheses about the electrophysiological components for high and low working memory readers. Importantly, our paradigm enables us to combine behavioral and electrophysiological data to understand better how working memory relates to a reader’s ability to evaluate and inferentially revise their situation model.

Method

Participants

Seventy-seven people living in the city of Granada (Spain), with a mean age of 22.5 years (range: 18–37 years) were recruited by an Internet advertisement to participate for payment. All were native English speakers and gave their consent to participate in the experiment. After they performed the two memory tasks (memory updating and reading span), only participants with extreme working memory scores (see below) were invited to complete the situation model revision task.

Materials

Memory updating task

We developed an English version of Carretti, Belacchi, and Cornoldi’s (2010) memory updating task. Participants read lists of words, one word at a time. The number of words in the lists increased from 2 to 12 as the trials progressed. The words were concrete nouns referring to objects of different sizes (large or small, e.g., ‘ship’ or ‘pea’). The task was to recall the smallest object or objects in the list, according to their physical size. The number of words to be recalled was stated before each list and increased from one to five, with a fixed presentation order. Participants were required to a) activate and maintain each new word in working memory to compare its size with previously presented words, b) maintain activation of the smallest objects in the specified set size, and c) inhibit any previously activated words that no longer met the criteria (i.e., to inhibit a large-size object when they read the name of a smaller object). Therefore, the recall set of words had to be constantly revised as new words were presented. All participants completed all trials.

Reading span task

We used a version of Daneman and Carpenter’s (1980) reading span task. Participants read sets of sentences presented one by one and were required to recall the last word of each sentence, at the end of each set of sentences. The order of recall was not important, but participants could not start with the last word of the last sentence. There were five levels, increasing in difficulty from two to six sentences. A level was considered correct if participants recalled correctly each last word of at least three out of five (maximum) sets of sentences.

The score for both memory tasks was the total number of words correctly recalled minus the total number of words incorrectly recalled (intrusions). These scores were used to classify participants into the high and low working memory groups, using the criterion of being above or below (respectively) the mean of the total score in both working memory tasks (see scores below).
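The scoring rule and group split described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names, participant labels, and scores are hypothetical, and the sketch assumes that "high" means above the sample mean on both tasks and "low" means below the mean on both.

```python
from statistics import mean

def task_score(correct: int, intrusions: int) -> int:
    """Score = words correctly recalled minus words incorrectly recalled."""
    return correct - intrusions

def classify(participants: dict[str, dict[str, int]]) -> dict[str, str]:
    """Assumed rule: 'high' if above the sample mean on both tasks,
    'low' if below the mean on both, 'unclassified' otherwise."""
    upd_mean = mean(p["updating"] for p in participants.values())
    span_mean = mean(p["span"] for p in participants.values())
    labels = {}
    for pid, p in participants.items():
        if p["updating"] > upd_mean and p["span"] > span_mean:
            labels[pid] = "high"
        elif p["updating"] < upd_mean and p["span"] < span_mean:
            labels[pid] = "low"
        else:
            labels[pid] = "unclassified"
    return labels

# Hypothetical recall counts, for illustration only
sample = {
    "p01": {"updating": task_score(27, 1), "span": task_score(70, 2)},
    "p02": {"updating": task_score(22, 2), "span": task_score(30, 3)},
    "p03": {"updating": task_score(26, 0), "span": task_score(35, 1)},
}
labels = classify(sample)
```

Under this reading, a participant who falls above the mean on one task and below it on the other is left unclassified and would not be invited to the second session.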

Situation model revision task

We constructed 93 (3 practice, 90 experimental) five-sentence narrative texts, some modified from texts used by Lorsbach, Katz, and Cupak (1998). An example is shown in Table 1 (see the full set of materials in Appendix A). The first three sentences supported a specific inference to be made (e.g., ‘guitar’). There were three versions of Sentence 4: the neutral condition, which did not refer back to either the supported or the alternative inference and, therefore, was neither consistent nor inconsistent with the introduction; the no revise condition, which was consistent with the inference primed by the introduction; and the revise condition, which prompted readers to revise their situation model so that only the alternative inference was encoded, rather than the inference supported by the introduction. Reading times were the dependent variable for this sentence. Sentence 5 concluded with a disambiguating word (e.g., ‘violin’), which was always incongruent with the inference supported by the introduction and congruent with the inference supported by Sentence 4 in the revise condition. Consequently, the disambiguating word was unexpected in the neutral and no revise conditions, and expected in the revise condition. At the end of the text, a comprehension sentence requiring a true or false judgment was included to encourage participants to read for meaning.

A norming study provided empirical confirmation of concept preferences in our situation model revision task. Twenty-two participants (M = 22.7 years old; range: 18–55) read the introduction of each text (Sentences 1–3) and were then presented with a single word. Their task was to decide (yes/no) if the word fitted with the sense of the story. The word was either the target concept, which was most strongly supported by the introduction (e.g., ‘guitar’), the alternative concept (e.g., ‘violin’), or a nonstory concept (e.g., ‘poker’). Results of a one-way ANOVA performed on the percentage of accuracy showed a main effect of concept type, F(2, 42) = 92.92, p < .001, ηp² = .82, because participants were more likely to correctly accept the target concept (M = 83.95, SD = 7.71) and to correctly reject the nonstory concept (M = 88.95, SD = 7.24) than to accept the alternative concept (M = 50.43, SD = 12.43). Furthermore, when participants did accept the alternative concept, they took longer to do so (M = 2079 ms) compared with response times to the target concept (M = 1612 ms), t(21) = 3.72, p < .001 (see Footnote 2). This difference suggests that, after reading the introduction, the target concept was significantly more likely to be activated than the alternative concept, as intended. It is important to acknowledge that there was variability in the extent to which our 90 texts constrained the activation of the target concept in the introduction. A second norming study with a two-alternative forced choice task confirmed that the two critical concepts were both supported by Sentence 4 (e.g., ‘guitar’ for the no revise, and ‘violin’ for the revise). Fourteen participants (M = 20.9 years old; range: 18–26) read the introduction, followed by one of the two versions of Sentence 4. They were instructed to mark the concept that the text was about. Seven participants completed each version of each text. In the final study, we included only texts for which the appropriate word was selected in both versions by a minimum of five participants. The sample used in the norming studies did not take part in the main study.

The word frequency for each of the two critical concepts was examined using the Word Frequency Guide database (Zeno, Ivens, Millard, & Duvvuri, 1995) and did not differ (Ms = 56.58 and 47.18, for the no revise and revise concepts, respectively), t(89) = 0.27, p = .79. The word length of Sentence 4 did not differ between conditions (Ms = 11.70, 11.46, and 11.81, for the neutral, no revise, and revise conditions, respectively), F(2, 178) = 1.74, p = .18. Finally, although we tried to minimize nonmanipulated differences between our conditions, the structure for Sentence 4 varied across conditions.

Procedure

Materials were administered in two sessions. The first session took approximately 30 minutes and included the two memory tasks. The memory updating task was administered first. Before each word list, participants were informed of the number of words in the list and how many objects to recall. Each word was presented on a computer screen for 2 seconds. A question mark prompted recall, and the participant said their response out loud. A practice trial preceded the experimental trials. The reading span task was completed next. Participants were instructed to recall the last word of each sentence and, before each block, they were informed of the number of sentences (and words to recall) in the trial. Participants read each sentence at their own pace. At the end of the trial, a white screen appeared, and participants said aloud the words that they could remember. A practice trial preceded the experimental trials.

Before the second session, the scores of both working memory tasks were used to divide participants into two groups: 18 low and 18 high working memory readers. The mean number of words recalled for the low memory group was 21.11 (SD = 2.74; range = 16–24) in the memory updating task and 29.50 (SD = 8.03; range = 16–44) in the reading span task; and for the high memory group was 26.39 (SD = 1.50; range = 24–29) in the memory updating task and 68.39 (SD = 12.10; range = 47–86) in the reading span task. T-tests confirmed significant group differences in both memory tasks: updating, t(34) = 7.17, p < .001; and reading span, t(34) = 11.36, p < .001.
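The group comparisons above can be checked from the reported summary statistics alone. The sketch below assumes a standard pooled-variance (Student) independent-samples t-test, which is consistent with the reported 34 degrees of freedom. The function name is illustrative.

```python
import math

def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Independent-samples t with pooled variance; returns (t, df)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df

# High vs. low working memory group, means and SDs as reported in the text
t_upd, df_upd = pooled_t(26.39, 1.50, 18, 21.11, 2.74, 18)
t_span, df_span = pooled_t(68.39, 12.10, 18, 29.50, 8.03, 18)

print(round(t_upd, 2), df_upd)    # 7.17 34
print(round(t_span, 2), df_span)  # 11.36 34
```

Both values reproduce the reported statistics, t(34) = 7.17 for memory updating and t(34) = 11.36 for reading span.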

In the second session, participants completed the situation model revision task. This session took approximately 90 minutes and included only participants with low and high working memory. First, we placed the electrode cap onto the participant’s head to record the EEG. Each trial started with a fixation point (‘+’) that remained on the screen until the participant pressed the ‘B’ key on the keyboard to present the first sentence. Sentences 1–4 were presented one sentence at a time, and participants were instructed to read each sentence at their own pace, pressing the space bar to display the next sentence. The reading time of Sentence 4 (neutral vs. no revise vs. revise) was recorded. Immediately after, Sentence 5 was presented word by word with a fixed duration of 300 ms per word. In addition, there was a delay of 700 ms after the disambiguating word to ensure that the electrophysiological activities of the ERPs were registered. To prevent excessive noise in the electrophysiological data, we asked participants to try not to blink during the presentation of Sentence 5. Finally, participants were presented with a true/false comprehension sentence. This sentence always referred to information in the introduction (equally distributed across Sentences 1–3). Participants pressed the designated true or false key to respond.

Each of the 90 experimental texts was presented to each participant only once in one of the three conditions, counterbalanced across participants. The task was administered in three blocks, keeping the same proportion (10 texts) in each condition per block. The same number of participants completed each condition, and the presentation of texts was randomized within block. Three practice trials ensured that the instructions were understood.
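One way to realize this kind of counterbalancing is a Latin-square rotation of conditions across three presentation lists. The sketch below is an assumption for illustration, not the authors' E-prime implementation: the list identifiers, the fixed allocation of 30 consecutive texts per block, and the seed are all hypothetical.

```python
import random

CONDITIONS = ("neutral", "no revise", "revise")
N_TEXTS, N_BLOCKS = 90, 3

def build_blocks(list_id: int, seed: int = 0) -> list[list[tuple[int, str]]]:
    """Return three blocks of (text_index, condition) pairs for one list."""
    rng = random.Random(seed)
    per_block = N_TEXTS // N_BLOCKS
    blocks = []
    for b in range(N_BLOCKS):
        texts = range(b * per_block, (b + 1) * per_block)
        # Rotating by list_id gives each text a different condition in each list
        block = [(t, CONDITIONS[(t + list_id) % 3]) for t in texts]
        rng.shuffle(block)  # random presentation order within the block
        blocks.append(block)
    return blocks

blocks = build_blocks(list_id=1)
```

With this rotation every block holds 10 texts per condition, and across the three lists each text appears once in each condition, so assigning equal numbers of participants to each list balances texts over conditions.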

Apparatus

All tasks were presented by the E-prime software (Schneider, Eschman, & Zuccolotto, 2002), administered on a 19 in. CRT video monitor (refresh rate = 75 Hz). For the situation model revision task, scalp voltages were recorded from a SynAmps2 64-channel Quik-Cap, plugged into a Neuroscan SynAmps RT amplifier. The electrical signal was amplified with a 1–30 Hz band-pass filter and a continuous sample rate of 250 Hz. Ocular movements and blinks were also recorded by two pairs of channels: a) the vertical electrooculogram, recorded at the participant’s left eye with one electrode placed supraorbitally and another infraorbitally to measure blink artifacts; b) the horizontal electrooculogram, recorded at the external canthi with one electrode on the left and another on the right side to register eye movements. Impedances were kept below 5 kΩ. Both blinks and ocular movements were corrected. In addition, trials with artifacts were rejected (3.12 %) and, in those cases where electrodes had a high level of artifacts (>1 %), these were substituted by the average value of the group of nearest electrodes. Epochs spanning -200 to 800 ms with respect to the presentation of the target word (disambiguating word) were averaged and analyzed. Baseline correction was applied using the average EEG activity in the 200 ms preceding the onset of the target as a reference signal value. Separate ERP averages were computed for each condition and each participant. Individual averages were re-referenced off-line to the average of left and right mastoids.
Six regions of interest (ROI) were extracted from the 64 channels (see Figure 1), following the criteria of 1) symmetry between hemispheres and 2) the same number of electrodes (five sites; see Footnote 3): left frontal, or LF (F1, F3, F5, FC3, and FC5); right frontal, or RF (F2, F4, F6, FC4, and FC6); central, or C (C1, C2, CZ, FCZ, and CPZ); left parietal, or LP (P1, P3, P5, CP3, and CP5); right parietal, or RP (P2, P4, P6, CP4, and CP6); and occipital, or O (O1, O2, POZ, PO3, and PO4).
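For readers who want to reproduce the ROI grouping, the electrode sets listed above can be written as a lookup table. The electrode labels come from the text; the helper `roi_means` and the channels-by-time array layout are assumptions made for illustration.

```python
import numpy as np

# Electrode sets as listed in the text (five sites per region).
ROIS = {
    "LF": ["F1", "F3", "F5", "FC3", "FC5"],
    "RF": ["F2", "F4", "F6", "FC4", "FC6"],
    "C": ["C1", "C2", "CZ", "FCZ", "CPZ"],
    "LP": ["P1", "P3", "P5", "CP3", "CP5"],
    "RP": ["P2", "P4", "P6", "CP4", "CP6"],
    "O": ["O1", "O2", "POZ", "PO3", "PO4"],
}

def roi_means(data, channel_names):
    """Average a channels x time array over the electrodes of each ROI."""
    index = {name: i for i, name in enumerate(channel_names)}
    return {
        roi: data[[index[ch] for ch in channels]].mean(axis=0)
        for roi, channels in ROIS.items()
    }
```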

Fig. 1. The six regions of interest (ROI): left frontal (LF); right frontal (RF); central (C); left parietal (LP); right parietal (RP); and occipital (O)

Statistical analyses

We report statistical analyses of 36 participants for all trials (see Footnote 4). Working memory group was a between-subjects factor in all analyses. The behavioral analysis of the situation model revision task was conducted on reading times (milliseconds) per sentence. In the ERP analyses, the critical time windows were predefined by visual inspection. The mean amplitude was then calculated in the window of 220–300 ms (P3a and P3b) and the window of 300–550 ms (N400) after the disambiguating word onset (see Figure 2). Outlying amplitude values per condition, group, and ROI were detected with box-and-whisker plots and replaced by the mean, for both the P300 (3.70 %) and the N400 (2.47 %).
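The window-based amplitude measure and the outlier treatment can be sketched as follows. Two points are assumptions, since the text does not state them: the box-and-whisker criterion is taken to be the conventional 1.5 × IQR fences, and "the mean" is taken to be the mean of the remaining, non-outlying values in the same cell. The function names are hypothetical.

```python
import numpy as np

def mean_amplitude(epoch, window_ms, sfreq=250.0, tmin_ms=-200.0):
    """Mean voltage of a baseline-corrected epoch inside a latency window.

    epoch: array with time on the last axis, starting at tmin_ms.
    window_ms: (start, end) in ms relative to word onset, e.g. (220, 300).
    """
    start = int(round((window_ms[0] - tmin_ms) * sfreq / 1000.0))
    stop = int(round((window_ms[1] - tmin_ms) * sfreq / 1000.0))
    return epoch[..., start:stop].mean(axis=-1)

def replace_outliers_with_mean(values, k=1.5):
    """Flag values outside the box-plot fences and replace them by the mean.

    Assumption: fences at k * IQR beyond the quartiles (k = 1.5), and the
    replacement value is the mean of the non-outlying observations.
    """
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    is_outlier = (values < q1 - k * iqr) | (values > q3 + k * iqr)
    cleaned = values.copy()
    cleaned[is_outlier] = values[~is_outlier].mean()
    return cleaned, is_outlier
```

With a 250 Hz sampling rate and epochs starting at −200 ms, the 220–300 ms window corresponds to samples 105 to 124 of each epoch.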

Fig. 2. Graphical representation of the mean amplitude (in microvolts) for the P300 (pale gray column) and the N400 (dark gray column) components, divided by working memory group, condition and ROI

Results

Behavioral analysis (Sentence 4)

Reading times

To understand if readers generated the inference in the introduction and then evaluated their comprehension by detecting a mismatch in the revise condition, we performed a mixed model ANOVA with working memory group (high vs. low) and condition (neutral vs. no revise vs. revise) on the reading time of Sentence 4. There was only a main effect of condition, F(2, 68) = 11.27, p < .001, ηp² = .25, where the revise condition resulted in longer reading times (M = 3076 ms) than the other two conditions: neutral (M = 2801 ms), and no revise (M = 2714 ms). T-tests revealed that the revise condition significantly differed from the neutral, t(35) = 3.01, p = .005, and the no revise, t(35) = 4.21, p < .001, conditions. The comparison between the neutral and the no revise condition was not significant, t(35) = 1.43, p = .17. The memory group effect, F(1, 34) = 1.82, p = .19, and the memory group × condition interaction, F(2, 68) = 1.82, p = .17, were not significant (see Table 2 for means; see also Footnote 5).

Table 2 Reading time means for Sentence 4 of the situation model revision task, divided by working memory group and condition
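The effect size reported with each F ratio can be cross-checked from the test statistic and its degrees of freedom, assuming the reported index is partial eta squared, which equals F × df_effect / (F × df_effect + df_error). The helper below is a hypothetical convenience function, not part of the original analysis.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# Main effect of condition on Sentence 4 reading times: F(2, 68) = 11.27
print(round(partial_eta_squared(11.27, 2, 68), 2))  # 0.25
```

The recomputed value matches the .25 reported for the main effect of condition.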

ERP analysis (disambiguating word)

First, in order to see if both subcomponents of the P300 (P3a and P3b) could be distinguished in our data, we carried out a mixed model ANOVA with working memory group, condition and ROI on the mean amplitude data in the time window of 220–300 ms, dividing the ROIs into central-frontal (C, LF, and RF) and posterior (LP, RP, and O) regions. The analysis showed a tendency toward a larger positivity in the high memory group compared to the low memory group, F(1, 34) = 3.19, p = .08, ηp² = .09. There was a significant main effect of condition, F(2, 68) = 6.17, p < .01, ηp² = .15, with more positive amplitude in the neutral and no revise conditions compared to the revise condition. There was also a main effect of ROI, F(1, 34) = 174.67, p < .001, ηp² = .84, because the central-frontal regions were significantly more positive than the posterior regions. Critically, although no two-way interaction reached significance (all ps > .28), the three-way interaction was significant, F(2, 68) = 7.26, p < .01, ηp² = .18. Therefore, we conducted separate analyses for the P3a (C, LF, and RF) and the P3b (LP, RP, and O) subcomponents.

P3a analysis

Our aim was to see if readers generated the alternative inference on reading Sentence 4 in the revise condition and then evaluated the disambiguating word as already activated. To do this, we performed a mixed model ANOVA with working memory group, condition, and the three ROI (see Footnote 6) associated with the P3a (LF, RF, and C) on the mean amplitude data (for the disambiguating word) in the time window of 220–300 ms. As before, there was a tendency toward a larger positivity in the high memory group compared to the low memory group, F(1, 34) = 3.36, p = .08, ηp² = .09. The main effect of condition was significant, F(2, 68) = 3.87, p = .03, ηp² = .10, where, as predicted, the amplitude for the disambiguating word following the neutral and no revise conditions of Sentence 4 was larger (M = 2.51, SD = 1.25 and M = 2.54, SD = 1.17, respectively) than that found in the revise condition (M = 2.02, SD = 1.58). There was also a main effect of ROI, F(2, 68) = 5.83, p = .006, ηp² = .15, with larger positivity in the C and RF regions than in the LF region. No interactions were significant (all ps > .35).

P3b analysis

To determine if readers revised their mental representation by reducing activation of the previous incorrect inference on reading Sentence 4 in the revise condition, we performed a third mixed model ANOVA with working memory group, condition, and the three ROI related to the P3b (RP, LP, and O) on the mean amplitude data (for the disambiguating word) in the same temporal window. The main effect of memory group did not reach significance, F(1, 34) = 1.00, p = .33. There was a significant effect of condition, F(2, 68) = 7.42, p = .002, ηp² = .18, because, as predicted, the amplitude in the neutral and no revise conditions was more positive than in the revise condition. There was also a main effect of ROI, F(2, 68) = 72.11, p < .001, ηp² = .68, because the two parietal regions (LP and RP) were significantly more positive than the O region. In addition, there was a significant two-way interaction between memory group and condition, F(2, 68) = 3.79, p = .03, ηp² = .10. No other interactions reached significance (all ps > .10).

To identify the locus of the interaction between memory group and condition (see Figure 3), planned comparisons between conditions were carried out for each group separately, with a Bonferroni correction setting the alpha at .008. For the high memory group, significant differences between the revise condition and both the neutral and no revise conditions were apparent, t(17) = 4.02, p < .001, and t(17) = 3.13, p = .007, respectively, whereas the neutral and no revise conditions did not differ, t(17) = 1.26, p = .22. A different pattern was apparent for the low memory group: none of the contrasts reached significance (all ps > .44).
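The corrected alpha of .008 is what a Bonferroni adjustment yields if a familywise alpha of .05 is divided across six comparisons (three pairwise contrasts in each of two groups). Both the .05 level and the count of six are inferred from the reported threshold rather than stated in the text, and the function name is hypothetical. A short sketch of the decision rule applied to the high memory group:

```python
def bonferroni_alpha(family_alpha, n_comparisons):
    """Per-comparison alpha under a Bonferroni correction."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return family_alpha / n_comparisons

# Assumed: familywise alpha of .05 spread over 3 contrasts x 2 groups.
alpha = bonferroni_alpha(0.05, 6)

# p values reported for the high memory group ("< .001" entered as its bound).
high_group_p = {
    "revise vs. neutral": 0.001,
    "revise vs. no revise": 0.007,
    "neutral vs. no revise": 0.22,
}
significant = {label: p < alpha for label, p in high_group_p.items()}
```

Under these assumptions the two contrasts involving the revise condition fall below the corrected threshold and the neutral versus no revise contrast does not, in line with the pattern reported above.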

Fig. 3. Mean amplitude of the P3b subcomponent for the disambiguating word of the situation model revision task, divided by working memory group and condition

N400 analysis

To see if readers integrated the alternative inference on reading Sentence 4 in the revise condition, we performed a final mixed model ANOVA with working memory group, condition and the six ROI on the mean amplitude data (for the disambiguating word) in the time window of 300–550 ms. The main effect of memory group did not reach significance, F(1, 34) = 0.91, p = .35. There was a main effect of condition, F(2, 68) = 21.84, p < .001, ηp² = .39, because, as predicted, the amplitude in the neutral and no revise conditions was more negative than in the revise condition. There was also a tendency toward a main effect of ROI, F(5, 170) = 2.65, p = .07, ηp² = .07, with less negativity in the LP region. In addition, there were two significant interactions. The first two-way interaction, between condition and ROI, F(10, 340) = 3.94, p = .001, ηp² = .10, arose because the neutral and the no revise conditions were more negative than the revise condition, particularly in the RP region, t(35) = 7.23, p < .001, and t(35) = 6.60, p < .001, respectively. The second two-way interaction, between memory group and condition, F(2, 68) = 3.85, p = .03, ηp² = .10 (see Footnote 7), is of specific interest to our understanding of the integration process and is therefore discussed in detail below. A further two-way interaction between memory group and ROI showed a tendency toward significance, F(5, 170) = 2.08, p = .07, ηp² = .06, with larger negativity for the high compared to the low working memory group in the posterior regions (LP, RP, and O). Finally, the three-way interaction was not significant, F(10, 340) = 1.29, p = .23.

The two-way interaction between memory group and condition (see Figure 4) was explored further to understand the integration process. Planned comparisons between conditions for each memory group separately were used to identify the locus of this interaction, again with a Bonferroni correction setting the alpha at .008. Only the high memory group showed more negative amplitude in the no revise condition compared to the revise condition. Specifically, this group showed larger negativity in the neutral condition, t(17) = 6.80, p < .001, and the no revise condition, t(17) = 6.02, p < .001, compared to the revise condition. The neutral and the no revise conditions did not differ, t(17) = 0.19, p = .85. In contrast, the low memory group showed larger negativity in the neutral compared with the revise condition, t(17) = 3.44, p = .003, but there was no difference between the no revise and the revise condition, t(17) = 1.46, p = .16; nor between the neutral and the no revise condition, t(17) = 1.49, p = .15.

Fig. 4. Mean amplitude of the N400 component for the disambiguating word of the situation model revision task, divided by working memory group and condition

Discussion

The goal of this study was to investigate the dynamics of inferential revision in adults’ reading comprehension, using both behavioral and electrophysiological measures. To do so, we created a bespoke reading comprehension paradigm: the situation model revision task. In this, the introduction (Sentences 1–3) provided a general context that facilitated at least two plausible inferences, one of which was more likely than the other. Sentence 4 was either neutral, did not require a revision (no revise) because it was inferentially consistent with the most likely concept, or did prompt a revision (revise) because the description prompted an inference that was consistent with only the less likely concept. Our behavioral results indicated that all participants took longer to read this sentence in the revise compared to the neutral and no revise conditions.

The final sentence ended with the disambiguating word, which was always inconsistent with the most likely concept but consistent with the less likely concept supported in the revise condition. Here, our electrophysiological results differed by the specific ERP component. There were no working memory differences in the amplitude of the P3a: both memory groups presented larger positivity in the neutral and no revise conditions compared to the revise condition. In contrast, the pattern of findings for the P3b differed by working memory group: the high memory group showed significantly larger positivity in the neutral and no revise conditions compared to the revise condition, while the low memory group did not differ between conditions. Similarly, there were working memory differences in the N400 component: the high memory group demonstrated larger negativity in the neutral and no revise conditions than in the revise condition; however, the low memory group did not show a difference between the no revise and the revise condition, although a difference was apparent between the neutral and the revise condition.

Evaluation at the inferential level

In our texts, Sentence 4 of the revise condition always mismatched the interpretation supported by the introduction. The question was then whether readers were able to detect mismatches with their current situation model even though this information was processed at the inferential level. The reading time results demonstrated a large cost for both working memory groups, suggesting that all readers detected a mismatch between the new inferable information and the inference that was supported by the introduction of the text. In addition, the difference in reading times found between the neutral and revise conditions confirmed that the initial interpretation (e.g., ‘guitar’) was inferred by readers and incorporated into the situation model. Therefore, our behavioral results signal that when a highly constrained semantic context is provided, both high and low working memory readers are equally able to infer a knowledge-based elaborative inference and subsequently detect inferential information that is incompatible with that elaboration. This finding indicates that the evaluation process of monitoring may occur at the inferential level, which is congruent with other studies demonstrating that adults are able to evaluate their inferential comprehension (e.g., Poynor & Morris, 2003). In addition, it is also consistent with the minimalist hypothesis (McKoon & Ratcliff, 1980), which has claimed that elaborative inferences are automatically encoded during reading when 1) information is quickly and easily available in memory, or 2) they are necessary to provide coherence by text information or prior knowledge. Nevertheless, as we previously mentioned, our sentences were constructed so that the mismatch was easily detected. Thus, it is possible that working memory differences could arise if more subtle mismatches were introduced in these sentences.

A less clear matter is whether readers successfully revised their situation model after reading the revise condition in Sentence 4. From our point of view, two things could be happening here. One possibility is that readers activated and encoded the alternative interpretation (e.g., ‘violin’) and reduced activation of the initial interpretation. This would reflect revision of the memory representation. Alternatively, they may have activated and encoded the alternative interpretation without reducing activation of the initial interpretation. This would reflect a lack of revision of the memory representation. The reading times by themselves only speak to the evaluation process of comprehension monitoring, and do not clarify if the revision process took place when reading Sentence 4. The electrophysiological data recorded at the disambiguating word address this critical issue.

Evaluation and revision processes: P3a and P3b

According to the context-updating theory (Polich, 2003, 2007), the P3a occurs when incoming information demands attentional control because it is evaluated as ‘new’ or ‘different’ with respect to the current memory representation; in contrast, the P3b appears when that incoming information forces subsequent attentional resources to favor context updating by memory operations. Although this theoretical framework has been developed using a traditional attentional task (oddball paradigm), our situation model revision task produced results that are consistent with this framework.

First, we found larger positivity associated with the P3a in the neutral and no revise conditions compared to the revise condition. This effect indicates that readers required greater attentional control on reading the disambiguating word (e.g., ‘violin’) when earlier information had not prompted a revision (e.g., ‘beautiful curved body’–no revise; ‘national concert hall’–neutral), but not when it had prompted one (e.g., ‘matching bow’–revise). Thus, in general, readers were able to detect a difference or mismatch between the new (disambiguating) word and their current situation model. Interestingly, the lower positivity found in the revise condition also signaled that readers activated the alternative interpretation when reading Sentence 4. Second, we also found larger positivity associated with the P3b in the neutral and no revise conditions compared to the revise condition. However, the P3b effect was qualified by an interaction with group. Critically, the high memory group manifested smaller positivity related to the P3b for the revise condition relative to the other two conditions, while the low memory group did not show significant differences between conditions. These working memory differences found for the P3b suggest that the two groups had engaged in different processing when they read Sentence 4 in the revise condition.

On the one hand, the P3a findings indicate that both high and low working memory readers perceived the disambiguating word as ‘new’ when prior text information had not prompted a revision. That is, all readers detected the mismatch between that word and their current situation model. Of note, this converges with the reading time data: all readers detected the mismatch between Sentence 4 and the introduction in the revise condition. Thus, both the behavioral and P3a results suggest that adult readers are able to evaluate their comprehension during reading. On the other hand, the P3b findings indicated that high and low working memory readers differed in how they revised the situation model. The smaller positivity for the revise condition relative to the other two conditions shown by the high memory group signaled that this group did not require additional memory processes to update their situation model on reading the disambiguating word, because they had already revised their situation model on reading Sentence 4. Therefore, the high memory group had not only evaluated their comprehension by detecting a mismatch on reading Sentence 4 (as indicated by the longer reading times and the P3a), but had also revised their situation model, updating the final interpretation and significantly reducing activation of the previous interpretation. In contrast, the low memory group did not show significant differences between conditions for the P3b. This lack of differences indicated that the low memory group had not successfully revised their situation model on reading Sentence 4 in the revise condition because they had difficulties drawing activation away from the initial interpretation.

This interpretation is congruent with studies demonstrating that poor comprehenders with poor working memory capacity have problems in inhibiting irrelevant information (Cain, 2006; Carretti et al., 2004; Pimperton & Nation, 2010). It is also consistent with the Structure-Building framework (Gernsbacher, 1990, 1997), which argues that low working memory readers may experience problems with comprehension, because they fail to suppress no longer relevant information due to the generation of new substructures that reduce coherence of the situation model (e.g., Gernsbacher & Faust, 1991). In relation to this, a more specific proposal in the field of revision suggests that the information that is no longer relevant or outdated may exert an influence disrupting comprehension (e.g., Kendeou et al., 2013; O’Brien et al., 2010; Rapp & Kendeou, 2007, 2009). Taking all this evidence into account, our P3b results indicate that, in contrast to high working memory readers, low working memory readers have problems revising their situation model because they fail to inhibit the initial wrong interpretation. In addition, the presence of inferred information could make the revision of the situation model more difficult for low working memory readers, who may construct a more ‘imprecise’ or ‘inaccurate’ mental representation of the story. This is consistent with the literature of inference alteration, which suggests that memory processes are involved when an inference that has been previously activated must be replaced by a new one (e.g., Radvansky & Copeland, 2004). The process of integrating text information into a coherent situation model sheds light on this issue.

Integration process: N400

Similar to the P3b, the analysis of the N400 demonstrated that working memory capacity underpinned the ability to integrate information into the situation model. The high memory group showed larger negativity in the neutral and no revise conditions compared to the revise condition. This result indicated that they experienced difficulties with integrating the disambiguating word into their situation model when earlier text information had not required a revision; in contrast, they did not experience difficulties when a revision had been prompted by the text. Thus, high working memory readers were able to integrate the alternative interpretation into their situation model on reading Sentence 4 in the revise condition (supported by the reading time results). A different pattern was evident for the low memory group, broadly similar to that found for the P3b: they showed no significant difference between the no revise and the revise condition, although larger negativity was found in the neutral condition compared to the revise condition.

The lack of difference between the no revise and the revise condition for the low working memory group strongly suggests that they had not successfully integrated the alternative interpretation into their memory representation on reading Sentence 4 in the revise condition. Both critical concepts (e.g., ‘guitar/violin’) shared similar semantic properties, which could potentially interfere and disrupt the construction of an accurate situation model. Moreover, the difference found between the neutral and the revise condition suggests that the low working memory group was able to integrate the alternative concept on reading the disambiguating word only when the neutral condition was presented in Sentence 4, because the neutral information (which was unrelated to the critical concept) did not cause semantic interference. These results are congruent with studies showing the pervasive effect of semantic interference on updating the contents of working memory (e.g., Szmalec, Verbruggen, Vandierendonck, & Kemps, 2011). Therefore, low working memory readers seem to be able to accurately integrate new information into the situation model under some circumstances, but they experience difficulties if that information semantically interferes with other information that is already encoded.

A compatible hypothesis that cannot be ruled out in this study is the possibility that the activation of the N400, and therefore the level of integration into the situation model, could be related to the degree of awareness with which readers detected the unexpected information. Unfortunately, we did not systematically ask participants if they had noticed anything ‘odd’ in the texts. However, there was a tendency for some participants to report this. Future research should include checks for awareness of inconsistencies in the debriefing and analyses of these to determine if differences in awareness exist between memory groups.

A comprehensive view

Our findings can be understood within Kendeou and O’Brien’s (2014) KReC framework. This proposes five key principles that are required for knowledge revision within a situation model during reading: encoding, passive activation, co-activation, integration, and competing activation. In relation to our paradigm, it means that once the previous interpretation (e.g., ‘guitar’) is encoded and passively activated from long-term memory, the presentation of information supporting the revised interpretation (e.g., ‘matching bow’) will cause the co-activation of both interpretations (e.g., ‘guitar/violin’). Furthermore, the revise information of Sentence 4 will lead to the integration of the alternative interpretation (e.g., ‘violin’) within the situation model, drawing activation away from the previous but now-incorrect interpretation and reducing the interference between the two (competing activation).

Applying the KReC logic to our results, the evaluation of mismatched information (prompted by Sentence 4) requires both the activation and integration of the two interpretations, whereas the revision of the situation model (prompted by the disambiguating word) involves the better integration and increase in activation of the new concept. Therefore, on the one hand, a failure to co-activate both interpretations and to integrate the new information will result in problems in evaluation, because no incompatibility will be detected. On the other hand, a failure to integrate, together with a deficient competing activation of the final concept, will cause difficulties in revision, because activation will remain in the now-incorrect interpretation, hampering stronger activation of the new concept. Our data suggest that low working memory readers have problems revising their situation model, because the activation of the previous interpretation continues to compete (by semantic interference) with the new concept. That is, these readers have problems strengthening the activation of the final concept because they fail to inhibit the wrong interpretation. We believe that this is a promising framework for the future study of the inferential revision process.

Finally, to better understand our results it is important to consider the processes that were evaluated in the two working memory tasks and how they relate to the reading comprehension processes studied here. The reading span task was used as an index of the ability to actively maintain information, making it readily retrievable (Daneman & Carpenter, 1980), while the memory updating task was selected to measure the suppression of no longer relevant information (e.g., Carretti et al., 2010). Our situation model revision task tapped both of these aspects of working memory: it required the activation and maintenance of the previous interpretation as well as the inhibition of that interpretation when a more plausible inference was apparent. This viewpoint is not incompatible with Engle’s (2002) perspective that working memory reflects the ability to allocate attentional resources when the task involves interference control to maintain or suppress information. We believe that the ability to actively maintain and suppress no-longer-relevant information, combined with the need to control interference, are both crucial in explaining how individual differences in working memory are related to the inferential revision process.

Conclusions

To our knowledge, this is the first study to report ERP data associated with the inferential revision of readers’ situation models. We have proposed that the context-updating theory is a promising framework to dissociate the comprehension monitoring processes of evaluation and revision. Through this, we have identified where in the reading process readers with poor working memory have difficulties revising their situation model. Our ERP data provide evidence that strongly suggests that low working memory readers are able to evaluate their comprehension and detect coherence breaks in the text (evidenced by the P3a), but that they have difficulties revising their memory representation because they fail to inhibit an initial wrong interpretation (P3b) and to integrate new information to ensure an accurate situation model (N400).