In addition to looking at the participants’ answers to questions following the experimental texts, we addressed the research questions by looking at the pattern of reading times as determined by means of the self-paced (sentence-by-sentence) moving window method (Just, Carpenter, & Woolley, 1982). In line with the findings in experienced adult readers (e.g. Huitema et al., 1993; Long & Chong, 2001), we hypothesized that in the global and local condition, the good comprehenders would spend more time reading the target sentence when it was inconsistent with the earlier-described character than when the same sentence was consistent with the character description. It is assumed that these longer reading times reflect, at least in part, the processes involved in the detection and resolution of consistency violations (e.g. Albrecht & O’Brien, 1993).
For the poor comprehenders, two possibilities were hypothesized. The first hypothesis starts from the assumption that poor comprehenders are impaired in their ability to construct a richly elaborated situation model (and thus fail to incorporate the character information) but that they do try to update the model (or working memory system) when the information that they did build in is contradicted and in need of revision. Thus, under the first hypothesis, we expected poor comprehenders to read inconsistent actions more slowly than consistent ones in the local condition (in which the protagonists’ actions can be viewed in the light of their character as the character information is still active in working memory) but not, or much less so, in the global condition (in which the target actions may not be evaluated against any character information as this information is likely to be lost from memory). The starting assumption underlying the second hypothesis is that poor comprehenders are especially poor at updating the situation model. This implies that even when poor comprehenders are able to detect an inconsistency, they are not expected to adjust their reading in an attempt to resolve it. Hence, under the second hypothesis, no differences in reading times were anticipated between inconsistent and consistent actions in either the global or the local condition.
Method
Participants
The participants were 31 children (18 boys/13 girls) with high reading comprehension levels (good comprehension group) and 26 children (16 boys/10 girls) with low reading comprehension levels (poor comprehension group). The groups were matched on age (M = 11.3, SD = .6 vs. M = 11.5, SD = .6, respectively, t(55) = 1.32, ns) and decoding skill (M = 81.52, SD = 12.50 vs. M = 77.19, SD = 12.42, respectively, t(55) = 1.31, ns). Decoding skill was assessed by the EMT, a standardized Dutch word reading test (Brus & Voeten, 1999). The EMT showed that the word decoding skills in all participating children were around their grade level and thus more or less automatized.
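The group-matching statistic for decoding skill can be approximately recomputed from the summary values given above. The sketch below assumes a pooled-variance (Student) independent-samples t-test, which is consistent with the reported 55 degrees of freedom but is not stated explicitly in the text; the function name `pooled_t` is illustrative only. Because the reported means and standard deviations are rounded, a recomputed value need not match the reported one exactly (this holds in particular for age, whose means are given to one decimal).

```python
import math


def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Independent-samples t statistic from summary statistics.

    Assumes equal population variances (pooled-variance Student t-test).
    Returns the t value and its degrees of freedom.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df


# Decoding skill (EMT): good comprehenders (n = 31) vs. poor comprehenders (n = 26)
t_decoding, df = pooled_t(81.52, 12.50, 31, 77.19, 12.42, 26)
print(f"t({df}) = {t_decoding:.2f}")
```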
The children attended either Grade 5 or 6 at a regular primary school in the Netherlands. They were native speakers of Dutch and had normal or corrected-to-normal vision. Exclusion criteria were any neurological disorder and an IQ below 85. IQ was estimated on the basis of two subtests (Vocabulary and Block Design) of the Dutch version of the Wechsler Intelligence Scale for Children-Revised (van Haasen, 1986). For all participating children, informed consent was obtained from their parents or caretakers.
Assessment of reading comprehension level
Children were classified as good or poor comprehenders based on their performance on the (Grade 5 and Grade 6 versions of the) standardized Test for Reading Comprehension of the Dutch National Institute for Educational Measurement (CITO) (“Toets Begrijpend Lezen”, Staphorsius & Krom, 1998). This test is part of the standard Dutch CITO pupil monitoring system and is designed to determine general reading comprehension level in primary school children.
To classify children as good or poor comprehenders, we compared the test scores with their age-appropriate norm scores. Children whose scores were among the highest 25% of the norm scores were classified as good comprehenders (M = 70.71, SD = 11.24), children scoring among the lowest 25% were classified as poor comprehenders (M = 35.96, SD = 12.49). The educational age-norms for average reading comprehension level were obtained in extensive standardization studies on reading in the Dutch population of primary school children (Staphorsius & Krom, 1998).
Materials and design
The experimental texts had the following structure. Each text began with a three- or four-sentence paragraph introducing a person. This was followed by a three- or four-sentence elaboration paragraph in which the characteristics (or preferences) of the person were described. These characteristics were either consistent or inconsistent with an action performed by the person later in the text. The elaboration paragraph was followed by a filler paragraph, which contained either one sentence (local condition: mean (M) number of words = 11.4, SD = 1.9) or five to seven sentences (global condition: M = 77.8, SD = 5.1). The text continued with the target sentence, in which the person performs the action that is either consistent or inconsistent with the character described in the elaboration paragraph. For example, the target sentence “Peter ordered a cheeseburger and fries” is consistent with his description as a fast-food addict but inconsistent with his description as a vegetarian (example taken from Albrecht & O’Brien (1993) and Long & Chong (2001)). In the local condition, the character elaboration (in this case, Peter’s food preferences) was still active in working memory while reading the target sentence, as the character description and the target action were separated by only one intervening sentence. In the global condition, on the other hand, the long filler paragraph ensured that the character description was eliminated from working memory (see Albrecht & O’Brien, 1993; Long & Chong, 2001). Importantly, all filler paragraphs were content-neutral with regard to the target action. The target sentence was followed by a second critical sentence (post-target sentence) so as to detect possible spillover effects. Each text ended with a brief closing paragraph.
Thus, in total, there were 4 within-subjects conditions formed by the crossing of 2 factors: location (local vs. global) and consistency (consistent vs. inconsistent). Each subject was presented with 8 experimental texts, 2 in each condition. The stimuli were arranged in 4 material sets, each containing the 8 texts. Each set was presented to 8 good comprehenders (with the exception of one set which was presented to 7 good comprehenders) and 7 poor comprehenders (with the exception of two sets which were presented to 6 poor comprehenders).
In order to ensure full combination of conditions and materials (and control for text effects), the different versions of each text were counterbalanced across the material sets by means of a 4 × 4 Latin square design. Thus, across sets and across participants, each story occurred equally often in the local/consistent, local/inconsistent, global/consistent and global/inconsistent version. The order in which the texts were presented in each set was pseudorandomized.
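The counterbalancing scheme can be illustrated with a small sketch. It assumes a cyclic 4 × 4 Latin square, in which the version of a text shifts by one condition from one material set to the next; the actual square used in the study is not reported, and the names `CONDITIONS` and `latin_square_sets` are hypothetical.

```python
from collections import Counter

# The four versions of each text: location crossed with consistency.
CONDITIONS = [
    ("local", "consistent"),
    ("local", "inconsistent"),
    ("global", "consistent"),
    ("global", "inconsistent"),
]


def latin_square_sets(n_texts=8, n_sets=4):
    """Assign a version to every text in every material set.

    Assumption: a cyclic Latin square, so the condition index of a
    text is shifted by one position in each successive set.
    """
    return [
        {text: CONDITIONS[(text + s) % len(CONDITIONS)] for text in range(n_texts)}
        for s in range(n_sets)
    ]


material_sets = latin_square_sets()

# Within a set, each condition occurs twice (8 texts / 4 conditions).
print(Counter(material_sets[0].values()))
```

With this assignment, every set contains two texts per condition, and every text appears once in each of the four versions across the four sets.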
Self-paced moving window method
Reading times were collected by means of the self-paced moving window method (Just et al., 1982). In this method, subjects are presented with passages of text on a computer screen in such a way that words are masked by X’s. A window reveals one word, or one sentence, at a time to the reader. When the reader is finished comprehending a word/sentence, they press a key to move the window to the next word/sentence. Assuming that information is processed as soon as it is perceived (Just & Carpenter, 1980), the key pressing latencies (i.e. reading times) in the self-paced moving window task reflect the processing of the word/sentence contained in the window.
In this experiment, we made use of the self-paced sentence-by-sentence procedure. Reading time on the target and post-target sentence was thus defined as beginning when the sentence was first revealed and lasting until the next key press.
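The logic of the sentence-by-sentence procedure can be sketched as follows. This is a minimal illustration, not the presentation software used in the study: it assumes that every non-space character (including punctuation) is masked by an X, and that timestamps are logged in milliseconds at the first reveal and at every subsequent key press. All function names are hypothetical.

```python
import re


def mask(sentence):
    """Replace every non-space character with 'X' (assumed masking rule)."""
    return re.sub(r"\S", "X", sentence)


def render(sentences, window):
    """Return the screen content with only the sentence at `window` unmasked."""
    return [s if i == window else mask(s) for i, s in enumerate(sentences)]


def reading_times(timestamps_ms):
    """Reading time per sentence: time from its reveal to the next key press.

    `timestamps_ms[0]` is the moment the first sentence is revealed;
    each later entry is a key press that reveals the next sentence.
    """
    return [later - earlier for earlier, later in zip(timestamps_ms, timestamps_ms[1:])]


screen = render(["Peter ordered a cheeseburger and fries.", "He ate."], window=0)
times = reading_times([0, 2100, 3900])  # illustrative timestamps, not data
```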
Procedure
Before the start of the inconsistency detection task, children were informed that the study was designed to examine reading comprehension of text displayed on a screen. They were instructed to read at their normal rate and to comprehend what they were reading as well as they could. As for the text presentation procedure, seven or eight masked sentences were presented on one screen. Each sentence consisted of either one or two lines of text. The target and post-target sentences always fit on one line of the screen. At the start, the moving window was placed on the first sentence of the first text. Subjects used the down-arrow key to see each successive sentence in a text. Pressing the down-arrow key on the last sentence on a screen caused the window to move to the first sentence on the next screen. The last sentence of each text revealed the words “NEXT TEXT” indicating the end of the text and the beginning of the next one. The first experimental sentence on the first screen was preceded by four test sentences.
The experimental texts were interspersed with short filler texts to prevent the subjects from becoming aware of the purpose of the experiment. After each text (i.e. each time the moving window revealed the words “NEXT TEXT”), a Yes or No comprehension question was asked. The question pertained to the situational content of the text just read and was asked orally by the experimenter. After answering the question, the subjects were instructed to directly jump to the next text (by pressing the down-arrow key). To prevent them from becoming tired, participants were given a break halfway through the experiment.
The children were tested at school. They completed the inconsistency detection task, sentence span task (see below), and word reading test in a silent room and the test for reading comprehension in the classroom (whole-class test taking). The experimenter was always present. In total, the experiment lasted approximately 1 h and 45 min.
Working memory capacity measure
To assess working memory capacity, an adaptation of the Sentence Span Task (Daneman & Carpenter, 1980; Swanson, 1994; Swanson, Cochran, & Ewers, 1989) was administered. In this task, participants were asked to read aloud groups of unrelated sentences (7–10 words in length). After reading, their task was to recall the last words of the sentences in the right order, and to answer a comprehension question about one of the sentences. The purpose of the question was to make sure that children read for comprehension and did not merely try to remember the target words. The number of sentences in the groups gradually increased. Working memory capacity was defined as the largest group of end words recalled (with the additional requirement that the comprehension question was answered correctly). The sentence span task measures verbal working memory capacity and predicts performance on reading tasks as well as other related tasks (Daneman & Green, 1986; Masson & Miller, 1983).
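The scoring rule described above can be sketched as a small function. The sketch assumes that one successful trial at a given group size is sufficient and that no discontinuation rule applies; neither detail is specified in the text, and the trial format and the name `sentence_span` are hypothetical.

```python
def sentence_span(trials):
    """Largest group size for which all end words were recalled in the
    right order and the comprehension question was answered correctly.

    Each trial is a dict with keys 'set_size', 'targets', 'recalled'
    and 'question_correct' (hypothetical format).
    """
    passed = [
        t["set_size"]
        for t in trials
        if t["recalled"] == t["targets"] and t["question_correct"]
    ]
    return max(passed, default=0)


# Illustrative trials, not data from the study.
example_trials = [
    {"set_size": 2, "targets": ["dog", "rain"], "recalled": ["dog", "rain"], "question_correct": True},
    {"set_size": 3, "targets": ["car", "tree", "milk"], "recalled": ["car", "tree", "milk"], "question_correct": False},
    {"set_size": 4, "targets": ["sun", "hat", "road", "fish"], "recalled": ["sun", "road", "hat", "fish"], "question_correct": True},
]
span = sentence_span(example_trials)
```

In this example the span is 2: the three-sentence group fails on the comprehension question and the four-sentence group fails on recall order.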
Results
Reading time on the target sentence
On the reading times, overall 2 × 2 × 2 analyses of variance (ANOVA) on the subject (F1) and item (i.e. text) (F2) means were conducted with consistency and location as within-subject variables and with comprehension group as the between-subject variable. In Fig. 1, reading time on the target sentence (in milliseconds) is presented as a function of consistency (consistent vs. inconsistent), location (local vs. global) and comprehension group (good vs. poor). The results of the ANOVA showed that, on the subject means, reading times were faster in the global than local condition (F1(1,55) = 5.40, p < .05, ηp² = .09; F2(1,2) = .45, ns), and that good comprehenders tended to read the target sentence faster than poor comprehenders (F1(1,55) = 3.11, p < .1, ηp² = .05; F2(1,2) = 9.39, p < .1, ηp² = .82; 0.01 is considered a small partial eta-squared effect size, 0.06 is considered a medium effect, and 0.14 is considered a large effect; Stevens, 2002). More important here is that the effect of consistency varied as a function of the location of the target sentence in poor but not in good comprehenders. In the local condition, both good and poor comprehenders read the target sentences more slowly when they were inconsistent with the character description than when they were consistent with it (Consistency × Group: F1(1,55) = .35, ns; F2(1,2) = .43, ns). In the global condition, on the other hand, the inconsistency effect was present in the good but not in the poor comprehenders (who read inconsistent sentences as fast as consistent ones; Consistency × Group interaction: F1(1,55) = 4.40, p < .05, ηp² = .07; F2(1,2) = 9.90, p < .1, ηp² = .83). In other words, poor comprehenders slowed down their reading on inconsistent sentences when the target action directly followed the character elaboration, but not when the target sentence and character elaboration were interspersed with a long filler paragraph (Consistency × Location: F1(1,55) = 4.44, p < .05, ηp² = .15; F2(1,2) = 14.33, p < .1, ηp² = .88). The above pattern of results was evident in the significant (F1) and marginally significant (F2) Consistency × Location × Group interaction (F1(1,55) = 4.51, p < .05, ηp² = .08; F2(1,2) = 12.99, p < .1, ηp² = .87). The three-way interaction disappeared, however, when working memory capacity was entered as a covariate (F1(1,54) = 2.37, p = .13).
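For readers who wish to relate the reported F ratios to the partial eta-squared values, the standard identity is partial eta-squared = (F × df_effect) / (F × df_effect + df_error). The sketch below applies it to two of the statistics reported above and labels the result with the benchmarks cited from Stevens (2002). The function names, and the label "negligible" for values below 0.01, are assumptions made for illustration.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta-squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def effect_size_label(eta_p2):
    """Benchmarks cited in the text: .01 small, .06 medium, .14 large."""
    if eta_p2 >= 0.14:
        return "large"
    if eta_p2 >= 0.06:
        return "medium"
    if eta_p2 >= 0.01:
        return "small"
    return "negligible"  # label below .01 is an assumption, not from the text


# Main effect of location on the target sentence, subject analysis: F1(1,55) = 5.40
location_f1 = partial_eta_squared(5.40, 1, 55)
# Main effect of group on the target sentence, item analysis: F2(1,2) = 9.39
group_f2 = partial_eta_squared(9.39, 1, 2)
print(round(location_f1, 2), effect_size_label(location_f1))
print(round(group_f2, 2), effect_size_label(group_f2))
```

The very large item-level values follow directly from the small error degrees of freedom (df = 2) in the item analyses.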
Reading time on the post-target sentence
In Table 1, reading time on the post-target sentence (in milliseconds) is presented as a function of consistency (consistent vs. inconsistent), location (local vs. global) and comprehension group (good vs. poor). The analyses on the reading times on the sentence immediately following the target sentence showed that good comprehenders had faster reading times than poor comprehenders (F1(1,55) = 7.79, p < .01, ηp² = .12; F2(1,2) = 15.99, p = .057, ηp² = .89). In addition, both groups exhibited an effect of location (F1(1,55) = 4.16, p < .05, ηp² = .07; F2(1,2) = 77.94, p < .05, ηp² = .98), signifying that they read the post-target sentence more slowly in the local than in the global condition. However, analyses on the post-target sentence reading times failed to show any significant differences between consistent and inconsistent sentences.
Table 1 Post-target sentence reading times (in milliseconds) as a function of consistency (consistent vs. inconsistent), location (local vs. global) and comprehension group (good vs. poor) (+SE)
Mixed-effects modeling
In addition to ANOVA, we analyzed the data using mixed-effects modeling with the maximum likelihood method to calculate parameter estimates. Mixed-effects modeling is a statistical technique for data repeatedly observed from the same subjects and/or materials. It is gaining popularity over the conventional methods because differences between individuals and differences between materials are modeled by means of (crossed) random effects, resulting in, among other things, increased power, a better account of heterogeneity of variance, and a better use of available data (e.g. Baayen, Davidson, & Bates, 2008; see Bicknell, Elman, Hare, McRae, & Kutas, 2010, and Kliegl, 2007, for applications of mixed-effects modeling in reading comprehension research using, respectively, self-paced reading and eye tracking).
In the present analyses, subjects and materials (i.e. experimental sentences) were thus treated as random effects, and consistency, location and group as fixed effects. The fixed effects were coded as follows: Group (good = 1, poor = 0), Consistency (consistent = 1, inconsistent = 0) and Location (local = 1, global = 0). To control for word-related effects, reading times were normalized by the number of words in the sentence, and the sentences’ average log word frequency (obtained from CELEX, the database of the Dutch Centre for Lexical Information; Baayen, Piepenbrock, & Gulikers, 1995) was added as a covariate to exclude a word frequency confound. We used the logarithm of word frequency because reading times are linearly related to the logarithm of word frequency, not to raw word frequencies (Haberlandt & Graesser, 1985; Just & Carpenter, 1987).
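The data preparation described in this paragraph can be sketched as follows. The sketch makes several assumptions that the text does not settle: words are counted by splitting on whitespace, the natural logarithm is used, and the sentence covariate is the mean of the per-word log frequencies rather than the log of the mean frequency. All function names are hypothetical, and the numbers in the example are illustrative rather than taken from the data.

```python
import math


def code_fixed_effects(group, consistency, location):
    """Dummy-code the three fixed effects as stated in the text."""
    return {
        "Group": int(group == "good"),                    # good = 1, poor = 0
        "Consistency": int(consistency == "consistent"),  # consistent = 1, inconsistent = 0
        "Location": int(location == "local"),             # local = 1, global = 0
    }


def per_word_reading_time(rt_ms, sentence):
    """Normalize a sentence reading time by its number of words
    (assumption: words are whitespace-separated tokens)."""
    return rt_ms / len(sentence.split())


def mean_log_frequency(word_frequencies):
    """Average log word frequency of a sentence
    (assumption: natural log, mean taken over per-word logs)."""
    return sum(math.log(f) for f in word_frequencies) / len(word_frequencies)


codes = code_fixed_effects("poor", "inconsistent", "global")
rt_per_word = per_word_reading_time(2400, "Peter ordered a cheeseburger and fries")
```

Under this coding, a poor comprehender reading an inconsistent sentence in the global condition is the reference cell (all codes 0), so the parameter estimates are interpreted relative to that cell.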
Results confirmed the findings reported above. The analysis on the target sentence reading times showed main effects of Group (F(1,57) = 4.42, p < .05), Consistency (F(1,399) = 6.41, p < .05) and Location (F(1,399) = 6.83, p < .01). (The F values are produced by Type III Tests of Fixed Effects, and since the design is balanced, they are expressed as the ratio of the appropriate sums of squares.) More importantly, the critical Consistency × Location × Group interaction was significant in predicting the reading times on the target sentence (F(1,399) = 9.55, p < .005) (model fit: −2 log likelihood = 6,225.77). The theoretically meaningful parameter estimates for Consistency (PE = 68.89, SE = 39.59) and Consistency × Location × Group (PE = 234.67, SE = 75.92) were, respectively, marginally significant (p < .1) and significant (p < .005). Results for the post-target sentence indicated main effects of Group (F(1,56.87) = 8.02, p < .01) and Location (F(1,397.74) = 5.40, p < .05), but no significant effects of Consistency, either as a main effect or through its interactions with the other factors (model fit: −2 log likelihood = 8,078.72). Parameter estimates for Consistency (PE = 51.81, SE = 298.39) and Consistency × Location × Group (PE = 254.31, SE = 572.22) were also not significant.
Comprehension questions
In Fig. 2, the average number of correct answers to the comprehension questions is presented as a function of consistency (consistent vs. inconsistent), location (local vs. global) and comprehension group (good vs. poor). As can be seen from Fig. 2, good comprehenders answered more questions correctly than poor comprehenders (F(1,55) = 11.06, p < .005, ηp² = .17). Of more significance is that the interaction between group and consistency depended on location, yielding a three-way interaction between those factors (F(1,55) = 4.97, p < .05, ηp² = .08). In the global condition, poor comprehenders gave more incorrect answers to inconsistent than consistent texts. Good comprehenders did not show this effect of consistency (Consistency × Group: F(1,55) = 8.01, p < .01, ηp² = .13), nor did either group in the local condition (Consistency × Group: F(1,55) = .00, ns). The three-way interaction remained marginally significant after controlling for working memory (F(1,54) = 3.22, p < .1, ηp² = .06).
Discussion
Experiment 1 investigated comprehension monitoring in 10- to 12-year-old children differing in general reading comprehension skill. The children’s reading times were measured as they read narrative texts in which an action of the protagonist was consistent or inconsistent with a description of the protagonist’s character given earlier. The character description and action were adjacent (local condition) or separated by a long filler paragraph (global condition). The goal was to find out whether poor comprehenders mainly differ from good comprehenders in situation model construction or updating. In this study, situation model construction specifically refers to the richness of the model, which here concerns the question of whether or not a reader incorporated the character information. Situation model updating refers to the adaptability and solution-readiness of the model in case of a consistency violation.
For the good comprehenders, we hypothesized that they would slow down their reading on inconsistent actions when compared to consistent ones in both the local and global condition. The target sentence reading time analysis confirmed this hypothesis at both the subject- and item-level, and in both the ANOVA and mixed model analysis, indicating that good comprehenders detected the inconsistencies and made an attempt to resolve them (e.g. Albrecht & O’Brien, 1993). Probably, the extra time good comprehenders spent on an inconsistent target sentence reflected their effort to double-check the inconsistency and/or think up possible resolutions. It is important to note that from the inconsistency effect they displayed in the global condition it can be inferred that good comprehenders must have represented the character elaboration in the situation model they constructed from the text. Otherwise, the character information would have been lost from their working memory when they read the target sentence as a consequence of which they would have missed its inconsistency (thereby leaving their above-mentioned updating skills unused).
The results for the good comprehenders are in line with the results obtained in experienced adult readers (e.g. Huitema et al., 1993; Long & Chong, 2001). Yet, this study is one of the few demonstrating these situation model construction and updating skills in children. Apparently, skilled 10- to 12-year-old readers who have automatized their lower-level word decoding skills (as was the case here) have freed up enough processing capacity for the higher-level processes to enhance reading comprehension. In particular, the present study showed that they can carry out reading comprehension strategies to aid the construction and updating of a coherent and richly connected situation model of a text.
In poor comprehenders, we found that reading times were slower on inconsistent than consistent target sentences in the local but not the global condition. This pattern of reading times, found at both the subject- and item-level, and in both the ANOVA and mixed model analysis, provides clear support for the first of the two proposed hypotheses. This hypothesis says that poor comprehenders have difficulty constructing a rich situation model. That is, they tend to leave situation-relevant information out of the model, including information that could be of use to them in interpreting later information in the text. In the present inconsistency detection task, this implies that poor comprehenders presumably failed to represent the character of the protagonist in their situation model in most stories. At least, this explains why they did not show any effect of inconsistency in the global condition. The inconsistencies simply went unnoticed as the information that was contradicted was present neither in their (short-term) working memory nor in their (long-term) situation model and therefore was no longer accessible.
Here, it is important to realize that the inconsistency effect exhibited by poor comprehenders in the local condition rules out an explanation in terms of impaired updating ability. Under the assumption that in this condition the character elaboration is still active in working memory, this finding indicates that when poor comprehenders understand that new information contradicts previous information, they do make an attempt to resolve the inconsistency and restore comprehension in their current (working memory) representation of the text. As in good comprehenders, the extra time poor comprehenders spent reading an inconsistent action statement probably reflected their effort to double-check the inconsistency and/or elaborate possible resolutions. It should be mentioned, however, that the effects of inconsistency were limited to the target sentence reading times in that they did not extend into the spillover sentence immediately following the action sentence. This finding, obtained in both the subject and item analyses, and in both the ANOVA and mixed model analysis, is intriguing but difficult to interpret, especially since studies similar to the present one have previously demonstrated such a spillover effect (Albrecht & O’Brien, 1993; Long & Chong, 2001; Poynor & Morris, 2003).
The above-described differences in situation model construction and updating between good and poor comprehenders were evident not only in the reading time data but also in the data on the comprehension questions (which were assumed to at least partially tap the situation model representation that the reader constructed from the text). It was found that the negative effect of consistency violations on comprehension was restricted to the poor comprehenders’ answers given in the global condition. As was the case with the reading time data, this can be accounted for by assuming that poor comprehenders did not build character information into their situation model, as a consequence of which they had a worse chance of noticing global inconsistencies. So, the comprehension question data coincide with the reading time data, and together, they indicate that poor comprehenders found it especially hard to build and maintain a coherent and integrated situation model from globally inconsistent texts.
This leaves us with the question of how to explain the influence of verbal working memory capacity. As was demonstrated in the analysis of covariance, the interacting effects of consistency, location and group disappeared after controlling for the participants’ sentence span. The most dominant conceptions of both verbal and non-verbal working memory assume that working memory consists of a storage component, in which information is maintained during processing, and a processing component, which coordinates the mental activities that are required by the task at hand (Baddeley, 1986; Shah & Miyake, 1996). In the present sentence span task, the processing component involved reading aloud a small set of sentences and the storage component involved remembering the last word of each sentence.
It is generally assumed that the relationship between verbal working memory tasks, such as the present sentence span task, and reading comprehension is mediated by the processing component of verbal working memory, rather than by the storage component (e.g. Daneman, 1987; Daneman & Tardif, 1987). According to this assumption, the lower verbal memory span of poor comprehenders demonstrated in this and other studies is thus a direct reflection of their weak language comprehension skills. Interestingly, Nation, Adams, Bowyer-Crane and Snowling (1999) showed that poor comprehenders had lower spans only on memory tasks that called upon semantic processing skills. On a series of other non-verbal/non-linguistic tasks, including one tapping spatial working memory, poor comprehenders were not impaired. From this, Nation et al. (1999) concluded that the poor comprehenders’ difficulties with reading and language skills cannot be related to a general processing capacity weakness. Instead, they came to the conclusion that “…the memory difficulties associated with poor reading comprehension are specific to the verbal domain and are a concomitant of language impairment, rather than a cause of reading comprehension failure” (p. 139). This may account for the finding that the interaction effects of consistency and location depended on the poor and good comprehenders’ verbal working memory capacity. That is to say, the effects were presumably weakened because the covariate represented the same conceptual measure (i.e. semantic processing skill) as the patterns of the dependent variables.