Introduction

In recent years a number of studies have been reported that examine eye-movement behaviour during silent sentence reading in children compared to adults (see Blythe & Joseph, 2011, and Blythe, 2014, for reviews); however, this research has predominantly focused on foveal reading processes. That is, examining word-identification processes for the directly fixated word (n). In contrast, there is a paucity of research that directly compares parafoveal reading processes in adults and children, examining how identification of the upcoming word (n+1) occurs and which factors can affect such processing.

The use of eye-movement recordings in order to study reading is a dominant research method for skilled adults, providing a moment-to-moment index of the reader’s cognitive processing of text (e.g., Rayner, 2009). Critically, such research has shown that, during a fixation on n, adults both process n and also begin to pre-process n+1. Subsequently, when n+1 is directly fixated, reading times are faster due to the pre-processing that has already occurred (see Schotter, Angele, & Rayner, 2012, for a review). This is referred to as parafoveal pre-processing, and can be considered a hallmark of skilled, fluent adult reading (Rayner, Liversedge, & White, 2006a). The importance of parafoveal pre-processing has been shown through a number of studies that have used gaze-contingent paradigms, where the stimulus changes as the reader progresses through the sentence dependent on the location of their fixation (e.g., the boundary paradigm; Rayner, 1975; see Fig. 1). Specifically, gaze-contingent techniques can be used to deny readers the opportunity for parafoveal pre-processing. It is quite clear that skilled adult readers depend upon parafoveal pre-processing for rapid, fluent sentence reading.

Fig. 1
figure 1

Example of the boundary paradigm (Rayner, 1975). Fixation locations are marked by the asterisk under the sentence. When a sentence is first presented on the screen, the target word is replaced with a preview letter string. When the participant is fixating the pre-target word (n-1; clever in this example), word n (e.g., sister) is unavailable for pre-processing. An invisible boundary is placed immediately in front of the target word (marked here by a vertical line for demonstration, though this is not visible on the participant's screen during the experiment). When the reader makes a saccade across the invisible boundary, the preview letter string (e.g., romlun) is replaced with the correctly spelled word and the reader is typically unaware that any change has occurred. Two control conditions are typically included – an identity condition, where the preview is identical to the target word, and a completely unrelated preview condition, where all letters are replaced with stimulus strings that do not provide any useful information about the upcoming word (e.g., romlun, as shown here). Reading times are typically shortest in the identity condition, as the reader has benefitted from undisrupted parafoveal pre-processing of the target word. Conversely, reading times are expected to be longest in the unrelated preview condition, as the reader has been unable to extract any information that might facilitate lexical identification. Experimental conditions then manipulate/preserve features of the upcoming word as per the manipulations of interest in the study. Reduced reading times on a target word observed after a correct (identity) preview, compared to an incorrect preview (i.e., the experimental conditions and the unrelated preview condition), is known as preview benefit

In order to gain insight into how beginner readers progress to be skilled readers, it is crucial to understand how this skill, so pivotal to skilled adult reading, develops. Through the boundary paradigm, by manipulating certain characteristics of the relationship between the preview letter string and the correct target word, it is possible to determine the type of information that is pre-processed in the parafovea. Adults pre-process orthography (a word’s printed form), for example displaying faster reading times after an orthographically similar preview is available compared to an orthographically dissimilar preview (e.g., cahc vs. picz as preview for cake; Balota, Pollatsek, & Rayner, 1985). The external letters of a word are particularly important for skilled adult readers in both parafoveal pre-processing (Johnson, Perea, & Rayner, 2007) and during subsequent direct fixation (Johnson & Eisler, 2012). Manipulations that affect the first or final letter of a word have a disproportionately large cost to reading times, relative to manipulations of internal letters, with the first letter seeming to play a particularly important role (e.g., Briihl & Inhoff, 1995; Inhoff, 1989a,b; White, Johnson, Liversedge, & Rayner, 2008).

Little research, however, investigating children’s parafoveal pre-processing in alphabetic languages has been undertaken.Footnote 1 One study has examined the first-letter advantage in parafoveal preview for children compared to adults. Pagán, Blythe, and Liversedge (2016) examined 8- to 9-year-old English children's orthographic pre-processing of the first three letters of an upcoming word. Similar to adults in terms of both the magnitude and the time course of their pre-processing, they also found that children showed a beginning bigram (the first two letters of a word) bias. This study only manipulated the first three letters of words in parafoveal preview, though, and orthographic pre-processing of the entire word form was not examined. Johnson, Oehrlein, and Roche (2018) have also provided evidence for the importance of first letters to children’s pre-processing: faster reading times were found when the first two letters of target words were maintained in preview – orthographically similar condition, compared to when all letters were substituted in preview – orthographically dissimilar condition (e.g., apydo vs. egydo as previews for apple). Thus, the beginning letters clearly play an important role in both adults’ and children’s parafoveal pre-processing, but, whilst Johnson et al.’s (2018) study might suggest that children extract orthography from the entire word in preview, whether children show external letter advantages for both first and final letters or whether this bias is limited to the first letters of a word is unknown.

In the present study two key questions were addressed: (1) whether children are able to pre-process whole target words in the parafovea; and (2) whether external or internal letters are more facilitative to parafoveal pre-processing. To examine these questions the boundary paradigm (Rayner, 1975) was used. The locations of letter substitutions within a target word were manipulated in preview to examine the spatial extent of orthographic pre-processing in children compared to adults – letters were substituted in preview at the beginning, middle, or end of the target words. Research using other experimental paradigms has indicated that children do pre-process some information up to 11 character spaces away from the point of fixation, although those studies did not show which lexical characteristics were processed (e.g., word length, word shape, letter identity, etc.; Häikiö, Bertram, Hyönä, & Niemi, 2009; Rayner, 1986; Sperlich, Schad, & Laubrock, 2015). On this basis, we predicted that both adults and children would be sensitive to letter substitutions at the end of the target word as well as at the beginning. We also expected to show a higher cost to both adults' and children's reading from manipulations that involved the first letters of a word compared to those that involved internal letters within a word (Pagán et al., 2016; White et al., 2008).

Method

Participants

Forty-two adults (Mage = 22.24 years) and 42 children (aged 8–9 years; Mage = 8.76 years) participated in the eye-tracking experiment. See Table 1 for a summary of group characteristics. All had normal or corrected-to-normal vision, and were native speakers of English with no known reading difficulties. This was confirmed by the reading subtests of the Wechsler Individual Achievement Test II UK (WIAT-II UK; Wechsler, 2005); all participants were within the expected range (adults’ composite standardised score range: 99–135; children’s composite standardised score range: 104–123; see also Table 1).

Table 1 Summary of group characteristics

Materials and design

We used the stimuli developed by Pagán et al. (2016), which consisted of 26 target words in sentence frames. These were supplemented by 34 additional target words and sentence frames that we created. Target words were either nouns or adjectives, and were bisyllabic with a CVCCVC structure, with the syllable boundary falling between the second and third consonants (see Table 2 for target-word properties). All materials were pre-screened for both the difficulty of the sentences and the predictability of the target words within each sentence, to confirm that the materials were suitable for use with our target age range. For the additional 34 target words, two possible sentence frames were created. Eighty children (8- to 9-year-olds; none of whom took part in the eye-tracking experiment) rated these sentences on a scale of 1 (easy to understand) to 7 (difficult to understand). They also completed a sentence-constraint rating (predictability) task for the 94 sentences (as Pagán et al., 2016, did not pre-screen for predictability), where the sentence frame was presented with a blank space in the target location and the children were asked to fill in the word that they thought best completed the sentence. The results from the pre-screening are shown in Table 2, and the final stimulus set was selected to ensure that the sentences were easy to understand for our target age range, and that the target word in each sentence was not highly predictable (to minimise skipping). For each of the new target words, one sentence frame was selected for use in the eye-movement experiment on the basis of this pre-screening. Six target words and their associated sentence frames were dropped (one from Pagán et al., 2016). The final stimulus set consisted of 54 experimental sentences.

Table 2 Linguistic properties of the target words and sentence frames

The boundary paradigm (Rayner, 1975) was used. Using this paradigm, the text displayed on the screen changes contingent on where the reader is fixating (see Fig. 1). A preview letter string occupies the target word location at trial onset, but when the reader makes a saccade to directly fixate the target word (crossing an invisible boundary), the preview letter string changes to the correct target word. In the current experiment, six parafoveal preview conditions (or letter strings) were generated for each target word (see Appendix 1). There were two control conditions: an identity condition, where the preview was identical to the target word (123456; e.g., sistersister), and an unrelated condition, where only the letter shapes of the target word were maintained in preview (dddddd; e.g., romlunsister). There were four other experimental conditions which each involved the substitution of three of the letters of the target words in preview: the beginning three letters of each word (ddd456); internal letters 2, 3, and 4 (1ddd56); internal letters 3, 4, and 5 (12ddd6); and the end three letters of each word (123ddd). Both the beginning and end substitution conditions were within one syllable, whilst the middle substitution conditions affected both syllables. Both CVCCVC structure and word shape were maintained in these substitutions.

The 54 experimental sentences were counterbalanced across six lists using a Latin-square design (nine sentences per condition). The sentences occupied one line on the screen (maximum = 77 characters; M = 60 characters) and each target word was placed near the middle of the sentence.

Apparatus and procedure

An EyeLink 1000 eye-tracker recorded right eye movements (SR Research). Forehead and chin rests were used to minimise head movements. The sentences were presented in 14-pt black Courier New font on the grey background of a 21-in. CRT monitor, with a refresh rate of 120 Hz, at a 60-cm viewing distance; one character subtended .34° of visual angle. Participants were instructed to read normally and for comprehension. Once participants had finished reading a sentence, they pressed a response key, and one-third of the sentences were replaced by a comprehension question, to which the participants responded. After completion of the experiment, participants were asked whether they had noticed anything strange about the appearance of the sentences in the experiment: detecting a display change can affect fixation times (e.g., White, Rayner, & Liversedge, 2005). Four adult participants reported noticing something unusual about the sentences, so their data were excluded from the analyses. The whole experiment lasted about 45 min per participant.

Results

All participants scored at least 78% correct on the comprehension questions (adults: M = 98%; children: M = 92%). The data were trimmed using the clean function in DataViewer (SR Research).Footnote 2 In total 1,886 fixations were merged or deleted (2.36% of the dataset; 693 adult fixations, and 1,193 child fixations).

Reading-time data on the target word in each sentence were analysed. Before analysing the local dependent measures, the data were further cleaned: trials in which the boundary change occurred early during a fixation on the pre-target word, and those that occurred late when the display change was not completed until more than 15 ms after onset of fixation on the target word were excluded from the analyses (230 adult trials – 10.14% of the adult trials, and 314 children’s trials – 13.84% of the children’s trials).Footnote 3 Prior to analysis, reading-time data were log transformed.

Data were analysed using linear mixed effects (lme) models, using the lmer function from the lme4 package (Bates, Mächler, Bolker, & Walker, 2015) within the R environment for Statistical Computing (R Core Team, 2020). We focus here upon three dependent measures: first fixation duration (the duration of the initial first-pass fixation on a word, regardless of how many fixations the word received), gaze duration (the sum of all fixations on the word before the eyes left it for the first time), and total reading time (the sum of all fixations made on the target word) (see Table 3). Participants and items were entered as crossed random effects. A full random structure was initially specified for participants and items, to avoid being anti-conservative (Barr, Levy, Scheepers, & Tily, 2013); the random structure was trimmed until the models converged. Effects were considered significant when, initially, |t| > +/-1.96.

Table 3 Mean and standard deviation (in parentheses) reading times on the target word in each condition

In all of the lme models there were significant group differences: children displayed significantly longer first fixations, gaze durations, and total reading times than the adults (see Table 3). We focus upon significant effects of the experimental manipulations, and any interactions with participant group.Footnote 4

Model 1

This model used the identity control condition (123456) as a baseline, with each of the non-word preview conditions compared to it, thus examining the potential costs associated with substitutions being present in the parafovea, and the extent to which participants were gaining preview benefit. As can be seen from Tables 3 and 4, for all of the non-word preview conditions the adults experienced a significant cost relative to the identity condition – their foveal word identification was facilitated by obtaining a processing benefit from the correct parafoveal preview. The presence of significant interactions with participant group suggests that adults and children differed in their processing of letter substitutions in preview, in the earlier measure of first fixation duration. In contrast to the adults, children showed little increase in reading times for any of the substitution conditions, with the exception of ddd456, demonstrating a lack of preview benefit. Clearly, both adults and children, though, experienced a cost to early measures of lexical processing when parafoveal pre-processing of the first letter of the word was disrupted. Substitutions of other letters in the word disrupted very early lexical processing for adults but not children, who showed delayed sensitivity to substitutions of all except the first letter of the word. Certainly, by the time the reader had engaged in second-pass reading on a word, both adults and children showed a cost to reading times from substitutions in all letter positions in preview, demonstrating comparable preview benefit effects.Footnote 5

Table 4 Output from Model 1 for first-fixation duration, gaze duration and total reading time

Model 2

This model collapsed ddd456 and 123ddd together, and 1ddd56 and 12ddd6 together, in order to compare external to internal letter manipulations. The contr.sdif function (package MASS) was used to set up the factors. Then, contrasts were run to compare ddd456 to 123ddd for adults and children separately. As shown in Table 5, and Fig. 2, the internal letter substitution conditions led to significantly faster reading times than the external letter substitution conditions, for both adults and children. Also, the contrasts revealed that, in first fixation duration, the children were showing a first-letter bias. Children’s reading times were significantly slower in ddd456 than 123ddd in this very early measure of processing (see Table 3). Interestingly, note that in gaze duration and total reading time this effect of external letter substitutions seemed mainly to be driven by the end letter (123ddd; see Table 3).

Table 5 Output from Model 2, and contrasts, for first-fixation duration, gaze duration, and total reading time
Fig. 2
figure 2

Mean reading times for the collapsed external letter substitution conditions (ddd456 and 123ddd) and the internal letter substitution conditions (1ddd56 and 12ddd6), for both adults and children

Controlling for multiple comparisons

Given that Models 1 and 2 contain a number of comparisons across the five experimental conditions, we ran these models again using the glht function (package multcomp) to adjust p values and control for the multiple comparisons being made within each model (Hothorn, Bretz, & Westfall, 2008).Footnote 6 For the majority of effects, this did not change the pattern of significance; we report here those instances where the correction did make a difference. First, within first fixation duration in Model 1, the interaction term between children and 12ddd6 became non-significant (and marginally significant between children and dddddd), suggesting that the children’s parafoveal pre-processing in these conditions was not significantly different (or only marginally so) to that of the adults.Footnote 7 Second, the interaction between children and dddddd became non-significant in total reading time; here, the children’s processing was consistent with that of the adults (see also Table B2). Third, within Model 2, in gaze duration, the main effect of external compared to internal letter substitutions in preview became marginally significant.

Discussion

The present study investigated parafoveal pre-processing in English children and adults during silent sentence reading, specifically comparing pre-processing of beginning, internal and end letters. As expected, the children did pre-process the whole target word in the parafovea. Like adults, they displayed a cost from 123ddd substitutions, demonstrating that they were sensitive to substitutions of the final letter of the target words (albeit a slightly delayed effect, i.e., present in gaze duration). This indicates that children’s parafoveal pre-processing (of n+1) was not constrained by visual acuity limitations. If pre-processing was constrained by visual acuity, 123ddd should have been the least disruptive condition, as those substitutions were furthest away from the point of fixation. Instead, the significant cost associated with end-letter substitutions clearly demonstrates that children's parafoveal pre-processing extended over the orthographic form of the whole word (six letters, in this case), rather than being constrained to the first few letters.

The data are suggestive of children’s processing being delayed compared to the skilled adult readers, with a developmental change in the time course of pre-processing: adults showed early effects in first fixation duration, whilst the two groups only patterned similarly in later processing. This is consistent with children's rate of lexical processing being slower than that of adults, as found by the E-Z Reader model when used to simulate adults’ and children’s eye movement behaviour during reading (Reichle et al., 2013). If children are slower to process word n then it stands to reason that they will also be slower to pre-process information from n+1. Consequently, each word in the sentence is pre-processed to a reduced degree, and is processed at a slower rate during direct fixation for a child compared to an adult. It is, therefore, unsurprising that children's overall reading times on words were longer, and that effects were delayed in children compared to adults.

This study provides strong evidence for the importance of external letters in children's lexical identification, consistent with skilled adult readers. As shown by collapsing, respectively, the internal and the external letter substitutions together, both adults and children benefitted from faster reading times when the internal letters (1ddd56 and 12ddd6), relative to the external letters (ddd456 and 123ddd), were substituted. Thus, consistent with the literature on skilled adult reading (White et al., 2008), the identity of a word's external letters facilitated children's parafoveal pre-processing more than its internal letters. With respect to syllabic boundaries, the conditions that substituted letters in both syllables of a word (1ddd56 and 12ddd6) were less disruptive to pre-processing than conditions that substituted letters in just one syllable (ddd456 and 123ddd). Thus, external letters are critical to parafoveal pre-processing, to a far greater degree than any pre-processing of syllabic structure.

These results are consistent with Grainger and Ziegler’s (2011) model of orthographic processing. Both the adults and the children, albeit delayed, appeared to be using coarse-grained orthographic processing.Footnote 8 The benefits gained from the internal letter substitutions, relative to the external letter substitutions, suggest that both groups were not sensitive to the absolute precise ordering of letters in preview, but were rather coding for the most visible letters that best constrained word identity and facilitated lexical identification – the external letters. This is broadly supportive of flexible letter position encoding models (e.g., SOLAR, Davis, 2010; SERIOL, Whitney, 2001).

The delay in the children’s pre-processing of orthography (preview benefit) compared to the adults could be due to orthographic representations being less precisely encoded in the children (e.g., Perfetti, 2007). When letter substitutions were present in preview this came at an immediate cost to the adults compared to the identity condition, whilst this effect was delayed in the children. If orthographic forms are less precisely encoded in children, they would experience less of an immediate cost when orthography is manipulated in preview, in contrast to the adults with their more precisely encoded orthographic representations, who would be more reliant on the presence of whole-word orthography in preview (as provided by the identity condition). Consequently, there would appear to be a developmental change in the tuning of orthographic word-recognition processes (e.g., Castles, Davis, Cavalot, & Forster, 2007).

One unexpected result was the lack of a first-letter bias in the adults, that is a more important role in preview for the first letter than the final letter, as found in previous studies (e.g., White et al., 2008), though when first and final letters were collapsed into a single, “external” condition, this was significantly different to internal letter substitutions (consistent with previous research). The present study did ultimately find though that the first letter of the target words was important to adults’ pre-processing (albeit not more so than the final letter); substituting the first letters in preview (ddd456) came at a significant cost relative to the identity condition. It may be that the finding of a first-letter bias depends on the exact nature of the experimental manipulation. Most research has looked at letter transpositions, not substitutions (e.g., Johnson & Eisler, 2012; Rayner, White, Johnson, & Liversedge, 2006b; White et al., 2008). Importantly, though, Johnson et al. (Experiment 3; 2007), showed that both first-letter transposition and substitution previews were detrimental to reading times.Footnote 9 Consequently, we would have expected an effect of first-letter substitutions in the adults. The lack of this effect could be due to the stimuli which, here, were specifically designed for children and would, therefore, have been very easy for the skilled adult readers. The adults’ ease of processing for these sentences may have resulted in a greater degree of parafoveal pre-processing for the target word than would be the case with more difficult sentences (e.g., Henderson & Ferreira, 1990). Thus, the adult readers may have allocated their attention across the entire form of n+1 (not just the initial letters). For the adults, consequently, both the first and final (external) letters were important to their pre-processing.

Children, similar to the adults, displayed sensitivity to first-letter substitutions very early in their lexical processing – in first-fixation duration. The 30-ms preview benefit effect found within this measure in the children was comparable in size to the effect found within the adults (35 ms). This suggests that the privileged status of the first letter/s to lexical identification is evident very early in both adults’ and children’s lexical processing, especially given how this information was manipulated parafoveally. Whilst the adults, though, did not show a first-letter bias (comparing ddd456 against 123ddd), the children did. This evidence for the importance of the first letter in children’s pre-processing is consistent with Pagán et al. (2016) and Johnson et al. (2018), who found numerical trends for a bias towards the first bigram of target words in all dependent measures for children.Footnote 10 Overall, the evidence strongly suggests that the first letter/s of words are important for facilitating children’s lexical identification in preview.

There are several reasons why the first letter of a word might be particularly important for lexical identification. One possibility is reduced lateral masking, or crowding, due to the inter-word space on one side, whilst internal letters are subject to greater lateral masking from the presence of other letters on both sides (e.g., Bouma, 1973; Levi, 2008). Alternatively, it could be more cognitively based, in that identification of the first letter of a word could drive the process of lexical identification. Certainly, Johnson and Eisler’s (2012) research, with adults, suggests this could be the case. For example, they found that when lateral masking was equated by replacing inter-word spaces with #s (e.g., The#boy#could#not#solve#the#problem#so#he#asked#for#help.), first letter transpositions were still significantly more difficult for readers than internal transpositions, whilst final letter transpositions were no more harmful than the internal transpositions (Experiments 1 and 2). This suggests a critically important role for the first letter of a word in lexical identification, irrespective of low-level visual factors like crowding. This finding contrasts with effects associated with a word’s final letter.

In summary, the present study provides novel evidence of children pre-processing whole words during English reading, and experiencing costs from external letter manipulations in preview, similar to adults. External letters appear to play a specific and important role in visual word recognition, seeming to fundamentally relate to how both adult and child readers access lexical information.