Over the past 30 years, numerous studies have illuminated the characteristics of eye movements and the mechanisms that direct the eyes toward novel information: In reading, this equates to directing the eyes toward upcoming words in the visual periphery (for overviews, see Rayner, 1998, 2009). Much less is known about regressions, that is, eye movements in the direction opposite to normal reading. Regressions make up 15 %–25 % of eye movements during normal reading (e.g., Rayner & Pollatsek, 1989): Most are fixations of the word immediately to the left of the last-fixated word (Vitu & McConkie, 2000), but a minority are longer-range regressions to an earlier word or to an earlier segment of the text. Regressions are planned and executed differently from forward-directed saccades (cf. McConkie, Kerr, Reddix, & Zola, 1988; Vitu & McConkie, 2000), and their nature and function are not well understood. Although they go against the normal order of word processing, they do not impede comprehension (Kolers, 1968).

In this article, we will test two accounts of the function of longer-range regressions (i.e., of more than one word; shorter-range regressions may serve different functions; Vitu, 2005). The first, and dominant, account argues that readers make regressions to reread words that they have already fixated or skipped. The second, more recent account holds that regressions serve to cue readers’ memory for words that they have already read. According to this account, actual rereading would be merely an incidental by-product of a regression. We expound on both accounts below.

The most obvious explanation of these longer-range regressions (see Rayner, 1998) is that they allow the reader to reread information that they have missed, forgotten, or are unsure about (hereafter, the “rereading hypothesis”). This hypothesis seems consistent with the evidence: Readers make more regressions when the text is complex (Rayner & Pollatsek, 1989), when the topic changes (Hyönä, 1995), when the text contains grammatical errors or ambiguities (Inhoff, Greenberg, Solomon, & Wang, 2009), or when they encounter information that disambiguates the preceding text (Blanchard & Iran-Nejad, 1987; Frazier & Rayner, 1982). All of these effects suggest that readers make regressions to reread or check previously read words.

Another way to account for regressions is to assume that they help the reader reinstate a cognitive action that is associated with a cognitive process that originally occurred at the regressed-to location (Kennedy, 1992); in other words, they serve to cue the reader’s memory for what has previously been read (hereafter, the “deictic pointer hypothesis”; see Ballard, Hayhoe, Pook, & Rao, 1997; Spivey, Richardson, & Fitneva, 2004). Readers often have a surprisingly good memory for the position of information on the page: Kennedy and Murray (1987) found that readers could make accurate regressions 60 character spaces in length. Reading a long text imposes considerable demands on working memory; it is plausible that readers could use words’ positions on the page as a kind of “external memory” (see O’Regan, 1992). When the eyes return to a location on the page, some of the properties of the text at that location (namely, its spatial properties) might be activated, making it easier to retrieve information associated with that location (Ferreira, Apel, & Henderson, 2008). Therefore, the deictic pointer hypothesis implies that moving the eyes back to the location of a previously read word is sufficient to cue the reader’s memory for that word, rendering actual rereading unnecessary.

It was Hebb (1968) who originally proposed that eye movements are important in memory, specifically in visual working memory (WM) and imagery. Eye movements follow similar patterns when viewing a stimulus and when later inspecting that same stimulus using imagery (Borst & Kosslyn, 2008; Brandt & Stark, 1997). Participants fixate the empty space where referred-to objects have previously been seen (Altmann, 2004; Altmann & Kamide, 2009; Ferreira et al., 2008) and while trying to recall stimuli that had previously occupied that location (Richardson & Spivey, 2000; Spivey & Geng, 2001; Theeuwes, unpublished data, as cited in Theeuwes, Belopolsky, & Olivers, 2009). However, from these studies it remains unclear whether the eye movements are functional or simply a by-product of retrieving a visual memory (see Altmann, 2004; Ferreira et al., 2008, for similar arguments).

More direct evidence that eye movements aid memory retrieval has come from studies in which retrieval was impaired by eye movements away from the location of the to-be-remembered item. Hale, Myerson, Rhee, Weiss, and Abrams (1996) asked participants to remember the locations of objects: When participants had to make a lateral eye movement following the presentation of each object, their spatial memory performance was impaired. Similarly, Lawrence, Myerson, Oonk, and Abrams (2001) found that saccades to a flashing peripheral cue disrupted spatial WM; importantly, unfixated cues did not disrupt WM, suggesting that this effect is truly related to eye movements and not an artifact of distraction. This conclusion was supported by Lawrence, Myerson, and Abrams (2004), who found that a spatial shift of attention did not disrupt WM as much as a physical eye movement. Similarly, Postle, Idzikowski, Della Sala, Logie, and Baddeley (2006, Exp. 4) found that visual WM was specifically vulnerable to disruption by eye movements, but not by other secondary tasks.

Perhaps the strongest evidence for the importance of eye movements in memory has come from Laeng and Teodorescu (2002, Exp. 2), who asked participants to view a grid containing images of tropical fish. The participants either had to maintain a central fixation or could move their eyes freely. The images were then removed, and the participants were asked about the physical characteristics of the fish. Participants who could move their eyes during viewing were more likely to move their eyes during recall, whereas those who fixated centrally during viewing tended also to fixate centrally during recall. Most importantly, participants who could move their eyes during viewing but were restricted to a central fixation during recall remembered significantly fewer details about the fish than did the other groups (although see Spivey & Geng, 2001). This suggests that making similar eye movements during viewing and recall has a functional role in memory recall, and that eye movements are not just by-products of retrieving visual memories (Mast & Kosslyn, 2002).

The evidence therefore suggests that eye movements may be capable of cuing recall. Reading seems an ideal task on which to use this ability, as it requires the encoding of large amounts of information that is bound to a series of specific spatial locations on the page. The deictic pointer account of regressions is therefore plausible, and given the uncertainty surrounding the function of regressions, it deserves careful consideration. Many of the characteristics of regressions (e.g., that they occur when readers are unsure about the meaning of the sentence or when the text’s topic changes) can equally well be explained by the deictic pointer hypothesis as they can by the rereading hypothesis.

The rereading and deictic pointer hypotheses are not mutually exclusive: Regressions may benefit reading by allowing rereading as well as by cuing memory. It is important to investigate whether deictic pointers are indeed used during reading and, if so, their importance in programming regressions. This question was explored by Inhoff and Weger (2005, Exp. 3), who showed participants a fact-defining statement sentence, followed by a question sentence, which asked the participant about some precise detail from the statement sentence. On half of the trials, the statement sentence was removed when participants fixated the question sentence (this is a variation of the gaze-contingent boundary paradigm; see Rayner, 1975). This prevented rereading but arguably still allowed participants to cue their memory for the sentence by making a regression. When the sentence disappeared, participants made a regression on only 4 % of trials, as opposed to 43 % when the sentence remained visible. The fact that readers made many fewer regressions when the sentence could not be reread provides initial evidence that rereading is the major purpose of regressions. Although this task was unlike normal reading and the regressions that were elicited may be functionally different from those in normal reading, this was an advantage for testing the deictic pointer hypothesis, since the task encouraged regressions, and using deictic pointers would provide a performance advantage. The fact that readers did not use deictic pointers in this situation suggests that they do not use them in normal reading. The present research aimed to further explore this result.

Experiment 1

Inhoff and Weger’s (2005) experiment supports the rereading hypothesis, but it has a methodological limitation: The statement sentence disappeared completely before the participant answered the comprehension question, which made rereading impossible. However, deictic pointers—which presumably relate to the positions of the words—may also have disappeared. The deictic pointer hypothesis includes the assumption that readers can more easily recall a previously read word if they fixate its location, but if the sentence disappears completely, it may be more difficult to do this, since spatial cues—such as adjacent words—are not available (although see Altmann, 2004; Altmann & Kamide, 2009). Thus, participants may not have regressed because they knew that they could not use deictic pointers to retrieve the answer, not because they knew that they could not reread the answer. Therefore, Inhoff and Weger’s results can also be accommodated by the deictic pointer hypothesis.

Our Experiment 1 eliminated this problem by changing the letters in the statement sentences, rather than removing them. For example, participants were presented with the statement sentence “My mother came from England and my father from Hungary”; after they read this statement, it was replaced with “Gi sowmty gorm hore Yngrwio ptg de ugewse huim Muvfrew.” This made rereading the statement impossible. However, the now-illegible statement preserved deictic pointers, because each word’s position, length, and capitalization were still available; if participants wanted to fixate a word from the statement, they could easily do so, but they could not actually read it. The deictic pointer hypothesis therefore yields the predictions that participants should have regressed when the statement was illegible and that regressing should improve their accuracy on the comprehension questions. However, the rereading hypothesis yields the prediction that when the statement sentence was illegible, participants should have realized that they could not reread the statement and therefore made no regressions; likewise, any regressions that they did make should not have improved their accuracy on the comprehension questions. Importantly, legibility was blocked, so participants were always aware whether the statement would be legible.

The present research also extends the work of Inhoff and Weger (2005) by manipulating visuospatial WM load: Participants remembered the locations of letters during the reading task. If readers regressed to cue their memory for words’ semantic content, visuospatial WM was presumably necessary for retaining these words’ locations. Also, eye movements have been selectively linked to spatial WM in the literature (Lawrence et al., 2001; Lawrence et al., 2004; Postle et al., 2006). Therefore, the deictic pointer hypothesis predicts that loading visuospatial WM with a secondary task should disrupt reading and comprehension performance in several ways. Firstly, it should be more difficult to retain the locations of words, and so regressions should be less accurate. Deictic pointers might also be less likely to be formed or be more fragile. Finally, more regressions may be made under high load, to alleviate the additional WM load from retaining the locations of words.

On the other hand, the rereading hypothesis yields the prediction that visuospatial WM load should not affect regression frequency or comprehension accuracy. Although high load might make it harder to accurately regress to the target word, there was no time limit: Participants would eventually find and reread the target word. When the statement sentence was illegible, rereading could not aid comprehension accuracy, regardless of load; therefore, no effect of visuospatial WM load on comprehension accuracy would be predicted, regardless of legibility. Note that visuospatial WM load should not greatly affect comprehension during initial reading of the statement sentence, since normal reading utilizes phonological more than visuospatial WM (e.g., Calvo & Eysenck, 1996).

These two changes ensured a more thorough test of the rereading and deictic pointer hypotheses.

Method

Participants

A group of 22 undergraduates from the University of Kent participated for course credit. All were native English speakers with no vision or reading problems.

Design

The experiment employed a 2 × 2 × 2 repeated measures design. The factors were Statement Legibility (legible or illegible), Target word’s Position in the sentence (early or late), and WM Load (high [five items] or low [two items]).

Apparatus and stimuli

The experiment was conducted using SR Research’s Experiment Builder and an EyeLink 1000 eyetracker. Participants sat supported by a chin- and head-rest 68 cm from a 19-in. ViewSonic G90fB monitor.

The sentences were taken from Inhoff and Weger (2005; see Appx. A below). There were 40 statement sentences, each from five to ten words long, and containing two potential target nouns. Each statement sentence was followed by one of two question sentences. For example, the statement “My mother is younger than my father” was followed by either “Who was born earlier?” in which case father became the target, or “Who was born later?” in which case mother became the target.

Each statement had a legible version, described above, and an illegible version. This was made by substituting every letter in the sentence. Spaces, punctuation, and capitalization were preserved so as to allow accurate regressions within the illegible sentences. The illegible sentences were identical for all participants and did not contain any real words or pronounceable nonwords.
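The letter-substitution step can be illustrated with a minimal sketch. This is not the procedure actually used to build the stimuli: the function name `make_illegible`, the random choice of replacement letters, and the `seed` parameter are assumptions for illustration, and the sketch does not check that the result avoids real words or pronounceable nonwords.

```python
import random
import string


def make_illegible(sentence: str, seed: int = 0) -> str:
    """Replace every ASCII letter with a different letter.

    Spaces, punctuation, and capitalization are preserved, so word
    positions and lengths stay the same as in the legible sentence.
    (Hypothetical helper; the original stimuli were prepared in advance
    and were identical for all participants.)
    """
    rng = random.Random(seed)  # fixed seed: same output on every call
    out = []
    for ch in sentence:
        if ch.isascii() and ch.isalpha():
            # Pick any lowercase letter except the original one.
            pool = [c for c in string.ascii_lowercase if c != ch.lower()]
            new = rng.choice(pool)
            out.append(new.upper() if ch.isupper() else new)
        else:
            out.append(ch)  # keep spaces and punctuation untouched
    return "".join(out)


legible = "My mother came from England and my father from Hungary"
print(make_illegible(legible))
```

Because the seed is fixed, repeated calls on the same sentence return the same string, which mirrors the requirement that the illegible versions be identical across participants.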

Text was presented in 12-point Courier font. The letters subtended approximately 0.25°. The question sentence directly followed the statement sentence, on a single line.

For the WM task, 20 high- and 20 low-load study phase stimuli were prepared. Each stimulus consisted of Xs, spread irregularly on a single line (see Fig. 1). In the high-load condition, there were five Xs; in the low-load condition, there were two. Stimuli were also prepared for the recognition test phase; these were either identical to their respective study phase stimulus, or identical except that one X had been moved slightly to one side. There were equal numbers of identical and nonidentical stimuli.

Fig. 1 Trial sequence for Experiment 1

Procedure

The participants were tested individually. After a briefing, the eyetracker was calibrated using three fixations: Gaze was only measured in the horizontal dimension. Calibration was validated using another three fixations and was repeated if any fixation was inaccurate by more than 0.5°. A drift correction was performed before each trial; if this was inaccurate by more than 0.5°, the tracker was recalibrated.

Each trial began with the WM study phase stimulus, which was presented for 2 s (see Fig. 1); participants had to remember the positions of the Xs. This was followed by the statement and question sentences, which were presented as a single line of text, for 5 s. Gaze position was recorded during this time. The participant gave a verbal answer to the question sentence; this was recorded by the experimenter, who sat out of sight. Participants were told to concentrate on accuracy, as the speed of response was unimportant. In the illegible condition, the computer switched the legible statement for its illegible version when the participant’s gaze crossed an invisible boundary between the statement and question sentences. This could happen at any time, and did not affect the timing of the trial.

Next, the WM test phase stimulus was presented for 5 s, or until the participant made a response. Each participant pressed the “R” key if the test stimulus was identical to the study phase stimulus, and “W” if it was not.

There were two blocks of 20 trials, one for the illegible condition, and one for the legible condition. Block order was counterbalanced. Sentences were randomly allocated to the conditions of the Target Position and WM Load factors and were presented in a random order.

Dependent variables

The dependent variables were the accuracy on the comprehension task; the proportion of trials on which regressions were made, where a regression was defined as a saccade from the question sentence back to the statement sentence; regression depth, or the distance in character spaces from the end of the statement sentence to the landing position of the regression (this was not the overall length of the regression, but a measure of how deeply it penetrated back into the statement sentence); and regression error, which was the distance in character spaces between the regression’s landing position and the center of the target word. These measures were calculated for the first and the second regressive fixation for each trial, as readers would often require two saccades to fixate the target of a regression (Inhoff & Weger, 2005). Fixations preceded and/or followed by a blink were excluded when calculating the dependent variables.
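The definitions of regression depth and regression error can be made concrete with a short sketch. The coordinate convention (character positions counted from the left edge of the line, with the statement end taken as the position just past its last character), the use of an unsigned distance for regression error, and the function name are assumptions for illustration, not a description of the analysis scripts.

```python
def regression_measures(landing_pos, statement_end, target_start, target_end):
    """Return (depth, error) for one regressive fixation, in character spaces.

    landing_pos   -- character position where the regression landed
    statement_end -- position just past the last character of the statement
    target_start  -- position of the first character of the target word
    target_end    -- position of the last character of the target word
    """
    # Depth: how far back into the statement the regression penetrated.
    depth = statement_end - landing_pos
    # Error: unsigned distance from the landing position to the target center.
    target_center = (target_start + target_end) / 2
    error = abs(landing_pos - target_center)
    return depth, error


statement = "My mother is younger than my father"
start = statement.index("mother")
end = start + len("mother") - 1
# A hypothetical regression landing on character position 10.
print(regression_measures(10, len(statement), start, end))
```

Under this convention, a regression that lands exactly on the center of the target word has an error of zero, and depth grows as the landing position moves toward the start of the statement.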

Results and discussion

The participants took M = 3,530 ms (SD = 1,136) to read the statement sentence, fixating for M = 257.83 ms (SD = 153.83) and making saccades of M = 10.90 letter spaces (SD = 11.27). Participants fixated the target word for M = 625.41 ms (SD = 444.14) before fixating the question, and fixated the target for M = 260.62 ms (SD = 175.42) on the first pass through the sentence; the target word was skipped on 23.64 % of the trials. None of these values were affected by Legibility (ts < 1.5, ps > .14). While initially reading the statement sentence, participants made M = 4.58 (SD = 2.11) regressions in the legible condition and M = 4.08 (SD = 1.93) in the illegible condition, t(861) = 3.56, p < .001. However, Legibility did not affect the number of regressions that they made to the target word, M = 0.93 (SD = 1.12), t(861) = 1.25, p = .21.

Regressions from the question back to the statement sentence were launched from M = 11.49 (SD = 7.95) letter spaces into the question text, for the illegible condition, and from M = 13.18 (SD = 7.80) letter spaces in, for the legible condition. The WM Load manipulation was successful: Participants were significantly more accurate in the low-load condition (M = .65, SD = .48) than they were in the high-load condition (M = .57, SD = .50), t(816) = 2.32, p < .05.

Linear mixed models

The number of regressions varied significantly between participants, F(21, 858) = 5.31, MSE = 0.07, η² = .12, p < .001. To control this noise, and because cell ns were very uneven, we analyzed the data with mixed models, treating individual trials as cases (see Janssen, 2012; Richter, 2006). We tested each dependent variable with the model

$$ \begin{aligned} \text{Dependent Variable} = {} & \beta_0 + u_0 + \beta_1 \cdot \text{Statement Legibility} + \beta_2 \cdot \text{Target Position} + \beta_3 \cdot \text{WM Load} \\ & + \beta_4 \cdot \left( \text{Statement Legibility} \times \text{Target Position} \right) + \beta_5 \cdot \left( \text{Statement Legibility} \times \text{WM Load} \right) \\ & + \beta_6 \cdot \left( \text{Target Position} \times \text{WM Load} \right) \\ & + \beta_7 \cdot \left( \text{Statement Legibility} \times \text{Target Position} \times \text{WM Load} \right) + \text{Error}, \end{aligned} \tag{1} $$

where β₀ is the intercept and u₀ is the random effect of participants upon the intercept. Restricted maximum likelihood estimation was used. Including the effects of interest yielded a significantly better fit than the null model (all likelihood ratios > 29, ps < .001) for all dependent variables except comprehension accuracy (likelihood ratio = –15.6, p < .05); for simplicity’s sake, all dependent variables were analyzed in the same way. The participants effect was at least marginal for all dependent variables (all Zs > 1.3, ps < .10). Standard ANOVAs with trials as cases yielded near-identical results. Descriptive statistics are presented in Table 1.

Table 1 Dependent variables for Experiment 1

Participants were significantly less likely to make regressions in the illegible than in the legible condition, F(1, 851) = 65.34, p < .001. No other effects were significant on the proportion of trials with regressions (all Fs < 1.7, ps > .19). This suggests that participants usually made regressions when they knew that the statement information was available to reread and supports the rereading hypothesis. The fact that participants did not make more regressions under high WM load refutes the prediction that using deictic pointers reduces the load on WM, and thus contradicts the deictic pointer hypothesis.

Regression depth was affected by a Statement Legibility × Target Position interaction, F(1, 658) = 4.64, p < .05, so that participants made deeper regressions to early target words when the statement was legible. In other words, participants were less likely to make deep, targeted regressions in the illegible condition. This supports the conclusion that regressions were motivated by the desire to reread information. There was also a Statement Legibility × WM Load interaction, F(1, 665) = 7.23, p < .01, so that legibility had a larger effect in the low-load condition. This effect cannot be predicted from either hypothesis and may simply indicate that, when WM load was high, participants were more likely to forget that the statement was unavailable, or to regress out of habit. There were also main effects of statement legibility, F(1, 659) = 58.48, p < .001, and target position, F(1, 658) = 44.05, p < .001. No other effects were significant (all Fs < 0.4, ps > .5). For the second fixation of the regression, the main effects of statement legibility, F(1, 597) = 28.24, p < .001, and target position, F(1, 592) = 84.79, p < .001, remained, but there were no other significant effects (all Fs < 2.1, ps > .1).

We found a Statement Legibility × Target Position interaction on regression errors, F(1, 662) = 43.21, p < .001: Regressions were less accurate to early than to late targets, but this effect was more pronounced in the illegible condition. This is consistent with the idea that regressions had no benefit in the illegible condition, so participants had no reason to make deep, targeted regressions. There were also main effects of target position, F(1, 661) = 186.17, p < .001, and statement legibility, F(1, 663) = 14.98, p < .001. No other effects were significant (all Fs < 1.6, ps > .2). The results were very similar for the second fixation of the regression.

WM load did not affect accuracy on the comprehension task, F(1, 778) = 1.50, p = .22, and did not interact with statement legibility, whether a regression was made, or their combination (all Fs < 1.2, ps > .28). Deictic pointers would presumably require visuospatial WM, so these null results suggest that participants were not using deictic pointers in this task. However, these null effects are predicted from the rereading hypothesis. Although null results are not usually interpretable, our interpretation here is supported by other, significant effects in this experiment.

Correlations among dependent variables

Importantly, there was a negative correlation between comprehension task accuracy and regression error, but only for the legible condition, r(389) = –.13, p < .01, and not the illegible condition, r(292) = .01, p = .81. When participants made a regression, they were more likely to subsequently give the correct answer when the regression landed close to its target; however, regressions could not improve comprehension task performance when the statement could not be reread. The second fixation of the regression showed the same relationships. These findings support the rereading hypothesis over the deictic pointer hypothesis.

Comprehension accuracy was higher when participants did not make a regression, Φ(778) = –.09, p < .01. In the illegible condition, regression depth correlated negatively with comprehension accuracy, r(292) = –.20, p < .001. Both results suggest that participants made regressions when they were unsure of the correct answer; the latter suggests that targeted regressions were not helpful if the statement sentence could not be reread. These results therefore also support the rereading hypothesis. Regression depth and comprehension accuracy did not correlate in the legible condition, r(389) = .07, p = .14, because in this condition participants could make targeted regressions, did so more frequently and accurately, and targeted regressions could assist them in giving the correct answer (see Table 1).
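For readers unfamiliar with the Φ coefficient used for the association between two binary variables (here, whether a regression was made and whether the answer was correct), a minimal sketch of the computation follows. The function name and the four-trial data set are made up for illustration and do not reproduce the experimental data.

```python
import math


def phi_coefficient(x, y):
    """Phi coefficient between two equally long sequences of 0/1 values."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    # Cell counts of the 2 x 2 contingency table.
    n11 = sum(1 for a, b in zip(x, y) if a and b)
    n10 = sum(1 for a, b in zip(x, y) if a and not b)
    n01 = sum(1 for a, b in zip(x, y) if not a and b)
    n00 = sum(1 for a, b in zip(x, y) if not a and not b)
    # Product of the four marginal totals.
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        raise ValueError("phi is undefined when a variable is constant")
    return (n11 * n00 - n10 * n01) / denom


# Made-up trials: 1 = regression made / answer correct, 0 = otherwise.
regressed = [1, 1, 0, 0]
correct = [0, 1, 1, 1]
print(phi_coefficient(regressed, correct))
```

A negative value, as in this toy example, indicates that correct answers were more common on trials without a regression.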

Experiment 1 supported the rereading hypothesis more than the deictic pointer hypothesis. Participants made fewer regressions, made shallower and less accurate regressions, and made more comprehension task errors when the statement information was illegible; WM load did not affect comprehension task accuracy. This suggests that readers make regressions to reread information. However, the deictic pointer hypothesis also received some support, in that participants made a large number of regressions in the illegible condition—despite legibility being blocked, so participants knew that they could not reread the target word—and these were somewhat targeted: Regressions were aimed deeper into the statement sentence when the target word appeared earlier in that sentence. This may indicate that regressions both allow rereading and activate deictic pointers. However, it is also possible that participants made regressions in the illegible condition out of habit, or because they forgot that the statement information was illegible (although this is unlikely). A further possibility is that the “regressions” recorded in the illegible condition often reflected a simple recentering of the gaze: The question sentences ended toward the right edge of the screen, and we might not expect participants to hold their gaze to the right at this point. Participants were also answering a question, so the “regressions” may have reflected a kind of gaze aversion (e.g., Doherty-Sneddon & Phelps, 2005). This recentering account fits with the interactions between legibility and target position on regression depth and on regression error: Untargeted recentering fixations would land farther from early targets, which were nearer the left edge of the screen, than from late targets.

Experiment 2 sought to distinguish whether regressions made when the statement sentence was unavailable reflected accessing of deictic pointers or were simply artifacts of habit, forgetting the statement was unavailable, or recentering the gaze.

Experiment 2

In Experiment 2, we employed a “simulated” reading task, in which the statement sentence was never presented onscreen. Instead, nonlinguistic markers were used as visuospatial reference points for each word in the statement sentence. Rereading of target words was impossible for the entire experiment, minimizing the chance of participants forgetting that the statement was unavailable. However, deictic pointers were available. Therefore, if participants made regressions in Experiment 2—and if these regressions aided their comprehension performance—this would suggest that participants were using deictic pointers.

Method

Participants

A group of 32 students from the University of Kent (26 female, 6 male) from 18 to 48 years of age (M = 20, SD = 5.49) participated for course credit. All were native English speakers with no vision, hearing, or reading problems.

Design

For the experiment we employed a 2 × 2 repeated measures design. The factors were Target Position (early or late) and WM Load (high [five items] or low [two items]).

Apparatus and stimuli

The apparatus and the statement and question stimuli were the same as those used in Experiment 1. To avoid confusion with the simulated words (see below), the WM stimuli were changed from Xs to Os; the WM stimuli were otherwise identical to those used in Experiment 1.

Audio recordings were made of every word from the statement stimuli. Each recording was 600 ms long and consisted of a single word spoken in a uniform, male voice.

Procedure

As in Experiment 1, participants carried out a WM task and a sentence comprehension task simultaneously. Each trial began with the WM study phase, followed by a fixation cross, presented on the left-hand side of the screen for 1 s. Next, the statement was presented by playing the appropriate single-word recordings, each immediately following the last. At each recording’s onset, the simulated word “XXXX” appeared on the screen, so that as the auditory statement was presented, a corresponding simulated statement gradually appeared onscreen, with the same word length as the real statement. Participants were instructed to follow this simulated statement with their eyes and to imagine that they were really reading the statement from the screen. Once the statement had been presented, the question text appeared on the same line as the simulated statement, which remained visible. Participants answered the question aloud. Participants were then presented with the WM test phase and responded with a keypress.

Forty trials were presented, with ten randomly allocated to each of the four cells of the design. Five additional, practice trials were presented before the real experiment began; there was no break or signal between the practice and experimental trials.
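The allocation of trials to the four cells of the design can be sketched as follows. The function name, the dictionary layout, and the `seed` parameter are assumptions for illustration; shuffling the presentation order is likewise an assumption, since the text specifies only that allocation to cells was random.

```python
import random
from itertools import product


def allocate_trials(sentence_ids, seed=None):
    """Randomly assign sentences to Target Position x WM Load cells."""
    rng = random.Random(seed)
    cells = list(product(("early", "late"), ("high", "low")))
    ids = list(sentence_ids)
    if len(ids) % len(cells) != 0:
        raise ValueError("number of sentences must divide evenly across cells")
    rng.shuffle(ids)  # random allocation of sentences to cells
    per_cell = len(ids) // len(cells)
    trials = []
    for i, (position, load) in enumerate(cells):
        for sid in ids[i * per_cell:(i + 1) * per_cell]:
            trials.append(
                {"sentence": sid, "target_position": position, "wm_load": load}
            )
    rng.shuffle(trials)  # assumed: presentation order is also randomized
    return trials


trials = allocate_trials(range(40), seed=1)
print(trials[0])
```

With 40 sentences and four cells, each cell receives ten trials, matching the design described above.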

Dependent variables

As before, these were comprehension task accuracy, the proportion of regressions made, regression depth, and regression error. Fixations preceded or followed by a blink were excluded from the analysis.

Results and discussion

Reading times, fixation times, and saccade lengths were dictated by the task. Nonetheless, participants fixated the target simulated word for M = 615.25 ms (SD = 370.51); the target simulated word was skipped on 31 % of trials. These figures are comparable to those from Experiment 1 and suggest, firstly, that participants were following the instructions correctly and, secondly, that they had ample opportunity to form deictic pointers. Due to a programming error, WM accuracy was not recorded. Regression launch sites were M = 9.52 (SD = 11.20) letter spaces from the beginning of the question sentence.

Linear mixed models

Regression behavior was influenced by individual differences: Some participants made many regressions, whereas others made few or none. To control this noise, we analyzed each dependent variable using the model

$$ \begin{aligned} \text{Dependent Variable} = {} & \beta_0 + u_0 + \beta_1 \cdot \text{Target Position} + \beta_2 \cdot \text{WM Load} \\ & + \beta_3 \cdot \left( \text{Target Position} \times \text{WM Load} \right) + \text{Error}, \end{aligned} \tag{2} $$

where β₀ and u₀ denote the intercept and the random effect of participants on the intercept, as before. For regression errors and regression depth, including the effects of interest yielded significantly better fit than the null model for both the first and second fixations of the regression (all likelihood ratios > 22.7, ps < .001). For the proportion of regressions made (likelihood ratio = –12.4, p < .01) and comprehension task accuracy (likelihood ratio = –1.1, p < .05), including the effects of interest did not improve fit. The effect of participants was at least marginally significant for all of the dependent variables (all Zs > 1.5, ps < .07), except for comprehension accuracy and regression errors of the first regressive fixation; for simplicity, all dependent variables were analyzed with mixed models (Table 2).

Table 2 Dependent variables for Experiment 2

Comprehension task accuracy was high (M = .94, SD = .24) and was not affected by WM load, F(1, 294) = 0.10, p = .75, or by its interaction with target position, F(1, 294) = 0.36, p = .55. These null results are predicted by the rereading hypothesis; if participants were using deictic pointers to answer the comprehension questions, they should have found this more difficult under high visuospatial WM load and made more errors.

The proportion of regressions (M = .25, SD = .43) was unaffected by target position or WM load (all Fs < 0.4, ps > .50). The facts that regressions were much less common in Experiment 2 and that their probability was not affected by the independent variables suggest that participants were not using regressions to aid their comprehension performance, and so argue against the deictic pointer hypothesis. In general, comprehension accuracy was high but not at ceiling, yet regressions were not common. This too argues against the idea that regressions serve any function besides rereading. However, it is possible that participants would have had more need to use deictic pointers, and that WM load effects would have been detected, had the comprehension task been substantially more difficult.

Regression depth was unaffected by target position, F(1, 151) = 0.03, p = .86. This implies that participants were not aiming their regressions toward the target words. The regressions recorded in this experiment may have been artifacts of habit or reflexive returns of the gaze to the center of the screen. If these regressions did facilitate answering the comprehension question, they did not do so by allowing fixation of the target word. Either way, these results do not support a deictic pointer account of regressions in normal reading. Regression depth was also not affected by WM load, nor by the Load × Target Position interaction (both Fs < 1.6, ps > .2). This suggests that participants were not forgetting that the target word could not be reread, as this would have happened more frequently under high WM load. The second regression’s depth was also unaffected by the independent variables (all Fs < 2.2, ps > .14).

The regression error of the first fixation was affected by target position, so that regressions were less accurate for early (M = 31.18, SD = 13.32) than for late (M = 9.82, SD = 8.87) targets, F(1, 157) = 7.57, p < .01. Given that regressions were not aimed toward target words (see above), this effect may simply reflect the fact that early target words were farther from the center of the screen (M = 34 letter spaces) than were late targets (M = 5 letter spaces), and so were more likely to register a large regression error if participants fixated a random location on the screen. There were no other effects on this dependent variable (Fs < 0.7, ps > .4).

The regression error of the second regression fixation was affected by WM load, so that fixations were less accurate under high load (M = 19.83, SD = 16.23) than under low load (M = 18.96, SD = 17.04), F(1, 83.5) = 4.39, p < .05. This effect is difficult to interpret, given the small effect size (d = 0.21) relative to the substantial average regression error. It is possible that, if participants did attempt to fixate the target word in a minority of trials, a high visuospatial WM load made it more difficult to retain that target word’s position. There were no other effects on this dependent variable (Fs < 2.8, ps > .1).

Relationships between dependent variables

Participants were again less accurate when they made a regression (n = 314; M = .90, SD = .30) than when they did not (n = 966; M = .95, SD = .22), t(1278) = 2.69, p < .01, d = 0.21 (t is corrected for unequal variances), suggesting that participants regressed when they were unsure of the answer. However, these regressions did not assist recall of the statement sentence: There was no correlation between regression error and comprehension accuracy for either fixation of the regression (|r|s < .03, ps > .60).
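
As a rough check on the comparison above, the statistics can be recomputed from the reported summary values alone. The sketch below assumes that d was computed with the pooled standard deviation and that the unequal-variance correction is Welch's; the text does not state either, and the function names are introduced here for illustration.

```python
import math

def welch_t(m1, sd1, n1, m2, sd2, n2):
    """Welch's t statistic (unequal variances) for group 1 minus group 2."""
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (m1 - m2) / se

def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Cohen's d using the pooled standard deviation (an assumption here)."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled_var)

# Rounded summary values reported above: (mean accuracy, SD, number of trials).
no_regression = (0.95, 0.22, 966)
regression = (0.90, 0.30, 314)

t_value = welch_t(*no_regression, *regression)
d_value = cohens_d(*no_regression, *regression)
```

With these rounded inputs the sketch gives a d of about 0.21, matching the reported value, and a t of about 2.7, close to the reported 2.69. A small gap is expected because the means and standard deviations are rounded to two decimals.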

This experiment utilized an artificial reading situation in which only visuospatial information, not word information, was available to participants if they made a regression. Participants were then encouraged to make a regression, by their having to answer a specific question about the statement. Even in this situation, in which deictic pointers would be especially useful, participants did not appear to use them. Although deictic pointers can apparently be used to aid retrieval of other types of information (see the introduction), this experiment provided no evidence that this is the function of regressions in normal reading (Ferreira et al., 2008). It did provide some support for the argument that the large number of regressions in Experiment 1’s illegible condition may have partially reflected recenterings of the gaze (although note that regressions were deeper and more than three times more likely in Exp. 1’s illegible condition than in Exp. 2, leaving the possibility that participants used deictic pointers in Exp. 1 but not in Exp. 2).

The possibility remains that the “reading” situation in Experiment 2 was too artificial to capture normal reading processes. Note that participants performed better on the comprehension task here than they did in Experiment 1’s illegible condition, which employed a more naturalistic reading task, suggesting that they approached the two “reading” situations differently. Regressions during normal reading may allow for both rereading and the use of deictic pointers. Experiment 3 tested the rereading hypothesis in a more naturalistic reading situation.

Experiment 3

Experiment 3 was another attempt to thoroughly test the rereading and deictic pointer hypotheses. We assumed that readers are most likely to make a regression for the purpose of rereading if they are unsure about the word to which they are regressing. For example, they may forget the word and need to reread it to correctly understand the sentence. Participants should therefore not notice if the word that they regress to is different from the word that they previously read. In Experiment 3, participants read statement sentences similar to those in Experiment 1. In another variation on the gaze-contingent boundary paradigm (Rayner, 1975), we monitored their eye movements and, if they made a regression, we changed a word. If the rereading hypothesis is correct, participants should usually be unaware of this change. We measured this awareness in two ways: Firstly, we asked participants, after the experiment, whether they had seen any changes. Secondly, we measured whether they understood the original or the changed meaning of the sentence, by presenting both meanings as response alternatives in a comprehension test, after the statement sentence had been removed from the screen.

Note that the regressions measured here were within-sentence regressions, shorter than those studied in Experiments 1 and 2; these may not function in the same way. However, these regressions are naturally occurring, rather than being induced by a comprehension question, so in Experiment 3 we were able to assess the function of regressions in a more realistic situation than before.

Method

Participants

A group of 19 students (11 female, 8 male) at the University of Kent, aged 19 to 24 (M = 21.63, SD = 1.07) took part in the experiment, either for course credit or as a favor to the experimenter. All were native English speakers with no vision or reading problems.

Design

A repeated measures quasi-experimental design was used, comparing trials in which participants regressed to the target words to trials in which they did not (see the Results and Discussion section). The dependent variables were whether the participant chose the original meaning or the changed meaning on the comprehension test, and how many critical word changes they were aware of during the whole experiment.

Apparatus and stimuli

The eyetracker apparatus was the same as that described previously. The stimuli were adapted from those used before (see Appx. B). Forty sentences were presented, each with a pre- and postchange variant. The difference between these was that one target word was changed. For example, one prechange sentence was “Andy is a good driver but his cousin David is not.” This sentence’s postchange variant was “Andy is a good dancer but his cousin David is not.” The pre- and postchange words were of equal lengths. For the comprehension test, three alternative interpretations were presented, one matching the prechange meaning of the sentence (e.g., “Andy is safer on the road”), one matching the postchange meaning (e.g., “Andy has good rhythm”), and one baseline option that matched neither variant of the sentence (e.g., “Andy is safer in the water”).

Procedure

Participants were told that the experiment was investigating eye movements in reading. They were not told about the changing words until the end of the experiment.

Each sentence was presented once, in a random order. The sentence was presented onscreen for 2 s. During this time, participants’ eye movements were monitored. If a regression was made, the sentence switched to its postchange variant. Regressions were detected by means of an invisible boundary; if participants fixated any point more than ten character spaces beyond the critical word, and then made any right-to-left saccade, a regression was deemed to have been made. The change was implemented within one screen refresh, was made during the saccade itself, and the saccade started some distance away from the critical word (cf. Binder, Pollatsek, & Rayner, 1999), so participants were unable to see the change itself (Matin, 1974).
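
The boundary rule described above can be sketched offline as follows. This is a minimal illustration under stated assumptions, not the gaze-contingent display code: it works on a list of fixation positions rather than on live samples, it treats "and then" as "at any later point in the trial", and the function name `find_change_trigger` and its parameters are hypothetical.

```python
def find_change_trigger(fixations, critical_word_end, margin=10):
    """Return the index of the first fixation that follows a qualifying
    regression, or None if no regression is detected.

    fixations: horizontal fixation positions in character spaces, in time order.
    critical_word_end: position of the last character of the critical word.
    margin: distance beyond the critical word that arms the boundary.
    """
    armed = False
    for i in range(1, len(fixations)):
        previous, current = fixations[i - 1], fixations[i]
        # Arm the boundary once a fixation lands more than `margin`
        # character spaces beyond the critical word.
        if previous > critical_word_end + margin:
            armed = True
        # After arming, any right-to-left saccade counts as a regression.
        if armed and current < previous:
            return i
    return None

# Hypothetical trial: the critical word ends at character 20.
trigger = find_change_trigger([2, 9, 16, 24, 33, 41, 18], critical_word_end=20)
```

In this hypothetical trial, the boundary is armed by the fixation at position 33, and the leftward saccade from 41 to 18 triggers the change. A leftward saccade made before the eyes pass the boundary is ignored.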

After 2 s, the sentence was replaced with the comprehension test. The three alternative interpretations (see above) were presented together in a random order, numbered from 1 to 3. Participants pressed the appropriate key on the keyboard to select their answer. They were instructed to take their time and to try to get every answer correct. Note that the statement sentence and the comprehension test were never visible at the same time, and therefore that participants could not reread the statement once they knew the comprehension questions.

After the experiment, participants were probed for suspicion about the word changes, and the number of changes that they detected was recorded.

Results and discussion

Since we were more interested in participant-level variables than before, we switched to a more conventional analytic strategy, treating participants as cases. The specific hypotheses were tested with pairwise comparisons, comparing trials on which participants regressed to the target word with trials on which they did not. Participants made regressions on M = 37.37 out of the 40 trials and fixated the critical word during a regression on M = 5.74 trials. Due to a clerical error, specific fixation data were only available for ten participants: These indicated that participants fixated the target word for M = 366.86 ms (SD = 256.65) during their first reading of the sentence (i.e., before any regressions were made) and skipped the target word on 34.28 % of trials. Overall, the mean fixation while reading the statement sentence was 243.06 ms (SD = 162.51), and the mean saccade length was 11.52 letter spaces (SD = 9.44). The mean regression length was 10.31 (SD = 9.46) letter spaces.

Participants were more likely to select the original meaning of the sentence when they did not make a regression or did regress but did not fixate the target word (mean probability = .70, SD = .106), than when they did regress and fixate the target word (M = .26, SD = .251), t(18) = 8.66, d = 1.99, p < .001. Similarly, they were less likely to select the changed meaning when they did not regress to the target (M = .20, SD = .094) than when they did (M = .68, SD = .275), t(18) = 8.16, d = 1.87, p < .001. In other words, fixating the changed target word strongly affected participants’ interpretation of the sentence, as predicted by the rereading hypothesis.
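
The reported effect sizes are consistent with the standardized difference for paired data, d = t / √n, with n = 19 participants. The sketch below assumes this is how d was obtained; the text does not say so, and the function name is introduced here for illustration.

```python
import math

def dz_from_paired_t(t_value, n_participants):
    """Standardized mean difference for paired data: d_z = t / sqrt(n)."""
    return t_value / math.sqrt(n_participants)

# t values reported above, each with df = 18, that is, n = 19 participants.
d_original = dz_from_paired_t(8.66, 19)  # original-meaning comparison
d_changed = dz_from_paired_t(8.16, 19)   # changed-meaning comparison
```

Rounded to two decimals, these reproduce the reported values of 1.99 and 1.87.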

All but two of the participants noticed the target change at least once (M = 22.68 changes reported, SD = 30.79); however, participants’ estimates of the number of words that changed during the experiment were not related to the number of changes that had occurred (M = 37.24, SD = 1.79), r(19) = –.02, p = .95 (see Footnote 1), nor to the number of changed words that they fixated (M = 5.71, SD = 2.49), r(19) = .22, p = .36. Although these are null results, they do imply that participants were unaware of the changes most of the time. Together with the significant results above, these findings strongly support the rereading hypothesis.

General discussion

In three experiments, we examined whether readers make regressions to reread words or to access deictic pointers and cue their memory for what they have previously read. The results showed that (a) when previously read sentences are made illegible, which prevents rereading but retains deictic pointers, readers make fewer regressions, make less accurate regressions, and make more comprehension errors; (b) when sentence information is never presented visually but is instead accompanied by nonlinguistic deictic pointers, readers rarely regress, and the regressions that they do make are not aimed at the target word; and (c) when readers regress during normal reading, they are more influenced by what they read the second time than by what they read the first time.

These findings strongly suggest that readers make regressions to reread, rather than to cue their memory for those words. In Experiments 1 and 2, we used fairly artificial reading tasks that encouraged regressions, so their results may not reflect normal reading. However, these tasks were designed to be more likely to detect the use of deictic pointers, and yet they failed to do so. The literature already contains data on why regressions occur and what they achieve functionally: They seem to help the reader clarify ambiguities (e.g., Hyönä, 1995). The present findings elucidate how regressions achieve this function. Existing models that incorporate regressions, such as the E-Z Reader model (Pollatsek, Reichle, & Rayner, 2006; Reichle, Warren, & McConnell, 2009), accurately predict when regressions are likely to occur (Inhoff et al., 2009) but do not specify whether the motive is rereading or the use of deictic pointers. The present experiments clarify this point.

The rereading and deictic pointer hypotheses are not mutually exclusive: Regressions may serve both purposes in normal reading. Several results in the present experiments may support this conclusion: firstly, the relatively high number of regressions in Experiment 1’s illegible condition, and the fact that some of these regressions were targeted; secondly, some regressions were made in Experiment 2, despite the fact that the statement sentence was never available in this experiment (although these regressions were not aimed toward the target words). We have offered alternative accounts for these results, but these require empirical verification.

Given that eye movements seem to cue memory in other tasks (Laeng & Teodorescu, 2002; Postle et al., 2006) and that eye movements are generally affected by the contents of visuospatial WM (Hollingworth & Luck, 2009; Hollingworth, Richard, & Luck, 2008), if deictic pointers are not typically used in reading, the pertinent question is why. Logically, it seems that reading—with its high demands upon WM and the stable positions of the words on the page—would be an ideal task on which to utilize deictic pointers, but the present results suggest that the average reader is reluctant to do so. There could be at least three explanations for this.

Firstly, it may be easier to reread than to use deictic pointers when WM load is high. Comprehending a long, complex sentence—the kind of sentence on which readers are likely to regress—loads WM. Droll and Hayhoe (2007) measured eye movements during a sorting task and found that when WM load was high, participants tended to switch from holding the positions of bricks in WM to repeatedly seeking them with eye movements. Irwin and Zelinsky (2002) found that their participants could remember the locations of only three to five objects in WM; reading a sentence typically requires many more fixations than this, so perhaps it is not surprising that readers prefer to reread information, rather than using a memory-based strategy, for ambiguity resolution.

Secondly, it may be that readers cannot use deictic pointers because fixating the previously fixated word overwrites the word information held in memory (see Footnote 2). This would make it impossible to use deictic pointers, even if readers wanted to. We know that processing fixated words can be all but unavoidable in some situations (MacLeod, 1991). Indeed, eye movements to blank space are used strategically to avoid automatically processing information: When answering questions, participants often avoid their questioner’s gaze, especially when the questions are difficult (Glenberg, Schroeder, & Robertson, 1998). This apparently prevents cognitive overload that might result from fixating the questioner’s face and searching for the answer at the same time (Doherty-Sneddon & Phelps, 2005). When reading, any regression will fixate text, and it may be very difficult for the participant to avoid rereading the fixated word.

The third, and possibly most likely, explanation is that in normal reading, the word to which the reader regresses always remains available; readers need not learn to use deictic pointers, since this gives little advantage over active rereading. Perhaps future generations of readers, with more experience of reading scrollable text on small screens, may be more inclined to use deictic pointers, as in this situation rereading is often impossible, yet the memory for where a word previously appeared would be unaffected by scrolling. Future research must elucidate the tasks for which deictic pointers could aid performance, and under what circumstances they could facilitate reading.

The present results leave us more certain about the function of longer-range regressions. However, note that, as demonstrated by Vitu and McConkie (2000), the majority of regressions fixate the word immediately to the left of the preregression fixation; about a quarter of regressions fixate an earlier portion of the same word (Rayner & Pollatsek, 1989). The motivation for such short-range regressions may differ from that of the longer-range regressions studied here, perhaps by relying more heavily on deictic pointers; alternatively, immediate perceptual or semantic processes may be the dominant factor. Future research must clarify such differences.