Attention, Perception, & Psychophysics, Volume 73, Issue 1, pp 266–283

Saccade control in natural images is shaped by the information visible at fixation: evidence from asymmetric gaze-contingent windows

  • Tom Foulsham
  • Robert Teszka
  • Alan Kingstone
Article

Abstract

When people view images, their saccades are predominantly horizontal and show a positively skewed distribution of amplitudes. How are these patterns affected by the information close to fixation and the features in the periphery? We recorded saccades while observers encoded a set of scenes with a gaze-contingent window at fixation: Features inside a rectangular (Experiment 1) or elliptical (Experiment 2) window were intact; peripheral background was masked completely or blurred. When the window was asymmetric, with more information preserved either horizontally or vertically, saccades tended to follow the information within the window, rather than exploring unseen regions, which runs counter to the idea that saccades function to maximize information gain on each fixation. Window shape also affected fixation and amplitude distributions, but horizontal windows had less of an impact. The findings suggest that saccades follow the features currently being processed and that normal vision samples these features from a horizontally elongated region.

Keywords

Eye movements, Scene perception, Attention, Saliency, Natural vision

Introduction

The human visual environment is extremely rich. At any one time, people are faced with a continuous array of information comprising important or potentially useful items amidst a background of less informative noise. The visual system’s answer to this complexity is twofold. First, the retinas encode the whole visual field in a non-uniform manner: Spatial resolution is greatest at the fovea and decreases rapidly, meaning that objects in central vision are processed in fine detail, while neural resources are spared the intensive task of representing the whole environment at this level of precision. Second, a series of fast eye movements is then programmed to align the high-resolution fovea with different parts of the visual array. The efficiency of the visual system at processing the parts of the environment most important for the current task, therefore, depends crucially on its ability to make efficient eye movements. Specifically, the eye guidance system must compute where to move the eyes in order to process important regions, but this computation can only be an estimate based on the low-resolution preview of the periphery. Although the resolution of the visual system drops off exponentially as a stimulus is moved further from the current fixation, researchers often divide the visual field into the fovea (within about 1° of fixation), the parafovea (between about 1° and 5° from fixation), and the periphery (more than 5° from fixation; see, e.g., Larson & Loschky, 2009).

In this study, we examined global changes in eye movements during a scene-encoding task, by manipulating the extent of the scene that could be processed on each fixation, using a gaze-contingent display. With this aim in mind, we will first review some of the previous research investigating eye guidance in scenes and the use of gaze-contingent displays.

Eye guidance in natural scenes

Two of the earliest studies of eye movements used pictures of natural scenes and identified two important facts about where people look in such images (Buswell, 1935; Yarbus, 1967). First, fixations are not uniformly distributed but cluster on points of interest (e.g. faces and objects). Second, eye movement patterns change depending on the viewer’s task. Subsequent researchers have sought to determine what aspects of the image or the task influence the decision of where to move the eyes (for a review, see the recent special issue: Tatler, 2009).

One approach has been to identify the features commonly found at fixated locations (Reinagel & Zador, 1999) and use these features to compute a saliency map of conspicuous points in the image (Itti & Koch, 2000). This model predicts that people will look at the most salient points in the image, and the implication is that the visual system is computing saliency from peripheral information and using this as an estimate of the most important places to fixate. The saliency map model can predict eye movements better than chance, and it has the advantage of being applicable to any arbitrary image (Foulsham & Underwood, 2008; Peters, Iyer, Itti, & Koch, 2005). However, even the highest estimates of the correlation between saliency and fixation are small, and because image-statistical approaches are fundamentally correlational, the question of whether saliency actually causes fixation selection often remains unanswered. Furthermore, the whole idea of a model of eye guidance based only on image features is called into question by the demonstration that eye movements are highly dependent on the observer’s task. In search tasks, for example, participants are able to ignore salient regions and look towards regions that are similar to the target, as well as to areas where they expect targets to be found, given the context (Chen & Zelinsky, 2006; Foulsham & Underwood, 2007; Torralba, Oliva, Castelhano, & Henderson, 2006).

The models of eye movements that have been developed on the basis of search data can be considered top-down, in the sense that they possess task-relevant knowledge (normally, target features) independent of the actual stimulus (Navalpakkam & Itti, 2005; Rao, Zelinsky, Hayhoe, & Ballard, 2002; Torralba et al., 2006; Zelinsky, 2008). For example, in the Rao et al. model, saccades are programmed to locations showing the highest correlation with the target, producing efficient search, as well as less intuitive eye movement behaviour, such as centre-of-gravity fixations that land between objects. The Torralba et al. model combines bottom-up saliency, guidance by target features, and contextual guidance: a spatial prior to bias attention towards areas where targets are likely to be found (when one searches for pedestrians, search should be concentrated on the street and not the sky). Kanan et al. (2009) extended these ideas with their saliency using natural statistics (SUN) model, which incorporates contextual guidance and probabilistic maps based on object appearance.

Najemnik and Geisler (2005) took a different approach with their ideal observer model of search. Rather than programming eye movements to locations resembling the target, this model emphasizes that an optimal searcher will place fixations that maximize the information gained. For example, because the human visibility map is horizontally elongated (i.e. empirically measured detection performance drops off with eccentricity more rapidly above and below fixation), the ideal searcher will often move to the top or bottom of the display, where target presence is more uncertain, in order to maximize the information that can be gained given the horizontal visibility map. This model successfully matched several aspects of human performance at searching for a sinusoid amid 1/f noise, including search times and some global eye movement behaviours (Najemnik & Geisler, 2008). This approach highlights the importance of foveated models, which take into account the limited resolution in the periphery. The predictiveness of both top-down and bottom-up models improves when this fundamental anatomical feature of human vision is included (Parkhurst, Law, & Niebur, 2002; Peters et al., 2005; Zelinsky, 2008).

Although these models are useful examples of top-down guidance, it is unclear how they can be applied to tasks other than search (or even to search where target features are not completely known). For example, Foulsham and Underwood (2007, 2008) used a memory-encoding task, where participants are asked to view a series of scenes in preparation for a memory test; and Underwood, Jebbett, and Roberts (2004) and Foulsham, Kingstone, and Underwood (2008) used a picture–sentence verification task where participants had to verify the accuracy of a sentence that appeared after the image it was describing. What determines where people look in these tasks, in which there is no explicit target? Some of the highest correlations between saliency and fixation are found in free-viewing tasks (although even in such tasks, the correlations are very weak and can be overridden by top-down demands; see, e.g., Einhauser, Rutishauser, & Koch, 2008). This is perhaps because, in the absence of a target, visual saliency coincides with places that are useful for interpreting or remembering the scene. A rather different approach to investigating eye movements in such tasks is to consider the general spatial biases that occur in saccade selection. Fixations tend to be biased towards the centre of most images, and this can be dissociated from photographer bias and the distribution of visual features (Foulsham & Underwood, 2008; Tatler, 2007). Saccades tend to move horizontally, and their amplitudes show a characteristic, positively skewed distribution. The trend for horizontal saccades is seen even in square images, and it changes as the scene is rotated, demonstrating that it is related to scene content and is not a fundamental property of the oculomotor system (Foulsham et al., 2008). These systematic tendencies of eye movements in scenes can potentially predict where people will fixate just as well as image-based models (Tatler & Vincent, 2009).
It is therefore important to consider the causes of these tendencies and, in particular, the role of central and peripheral information. This article looks at how biases in saccade selection during an encoding task are altered by the use of a gaze-contingent viewing paradigm. We first review this technique.

The use of gaze-contingent displays

A gaze-contingent display is one that is updated in response to the viewer’s eye movements. This technique allows the experimenter to manipulate the information available at different eccentricities. In reading, the gaze-contingent moving-window design masks text outside a window that is centred on fixation. By varying the size of the window, the perceptual span at which reading can still proceed normally can be assessed (McConkie & Rayner, 1975; see Rayner, 1998, for a review). In search, a gaze-contingent window has been used to manipulate the target-similar features present in the periphery (Pomplun, Reingold, & Shen, 2001). Masking reduced the degree to which search was guided, supporting the idea that guidance operates preattentively and in parallel. In scene perception, Saida and Ikeda (1979) and Shioiri and Ikeda (1989) used a moving-window design to identify the useful field of view for picture memorisation—the size of the area around fixation which is actually used for perception. Memory performance improved as the size of the window increased, although large windows of around 10° in diameter elicited performance similar to that for normal viewing. Interestingly, an analysis of eye movements suggested that the useful field of view tended to overlap on consecutive fixations, with around 75% of the saccades moving to points that could be processed on the previous fixation. Multi-resolutional displays take the gaze-contingent technique further by allowing the resolution of the scene to be steadily degraded as a function of increasing eccentricity (Loschky & McConkie, 2002; Reingold, Loschky, McConkie, & Stampe, 2003). When the function relating eccentricity and resolution is lower than or matches that in the human visual system, this is not detected by the observer (Loschky, McConkie, Yang, & Miller, 2005). 
Some of these results were confirmed by Geisler, Perry, and Najemnik (2006), who varied the drop-off in peripheral resolution while observers searched for a target in noise. The authors’ ideal searcher model produced similar behaviour, given the same eccentricity limitations. Masking the periphery outside a moving window also seems to lengthen fixation durations (Greene, 2006), particularly with small windows (Loschky & McConkie, 2002). van Diepen and d'Ydewalle (2003) found that while masking the region at fixation affected fixation durations most severely (as would be expected if fixations are dominated by processing at the current location), peripheral masking also had an effect.

Despite this research, there are surprisingly few studies using gaze-contingent displays to study issues of saccade control in scenes, certainly as compared with reading research (Rayner, 2009, made a similar observation while reviewing the literature). In particular, despite evidence for asymmetries in eye movements (e.g., the predominance of horizontal saccades) and in the processing of information at fixation (which appears to be horizontally elongated; Najemnik & Geisler, 2005), no study has yet varied the shape and symmetry of a gaze-contingent window in scenes.

The present research

In the present research, we compared eye movements during a scene-encoding task in several different gaze-contingent conditions, and we looked for effects on trends in saccade direction and amplitude. A particular point of interest was the bias for horizontal eye movements. What causes this bias? If it is due to the distribution of features in the periphery, it should be reduced or eliminated in viewing where the visible information is restricted and equal in all directions (as in a gaze-contingent display with a symmetrical square or circular window). Here, we contrasted normal viewing with a symmetrical moving window and two asymmetrical window conditions (Fig. 1). How would altering the shape of the window around fixation affect the saccades made?
Fig. 1

The viewing conditions from Experiment 1: Normal unconstrained viewing; viewing with a square gaze-contingent window; and viewing with rectangular, asymmetric windows oriented horizontally or vertically. The display is shown for one image on the first fixation (represented by a white cross for demonstration only)

On the basis of previous research, we would expect scanning to suffer with the gaze-contingent window manipulations, leading to shorter saccades and longer fixations. The models discussed above suggest two potential determinants of saccade selection, and these lead to two hypotheses in the case of an asymmetric window. First, if saccades are targeted towards features whose importance can be detected from the current fixation location, currently visible regions of the scene will have a greater influence than unseen areas. In the window conditions, the only features available are within the window, so we would expect short saccades that target things within this region. If the features of potential saccade targets are represented as points on a spatial map, the saliency of points outside the window will be reduced (see also Loschky & McConkie, 2002). In the case of asymmetric windows, there are more features in one direction, and this should therefore result in a change in the distribution of saccade directions: more horizontal saccades with a horizontal window and more vertical saccades with a vertical window.

A second possibility is that saccades are chosen to maximize the new information gained on each fixation. How should we define information maximization in an encoding task? The best strategy in such a task would be to look at as much of the scene as possible. We therefore propose that maximizing the information gained means moving to a location where the most new features can be seen. If this were the case, we would expect more vertical saccades in the horizontal window condition and more horizontal saccades in the vertical window condition. This pattern would “reveal” more of the image with each gaze shift by its moving to locations that were invisible (where information was zero) on the previous fixation. Our experiments investigated the effects of the differently shaped gaze-contingent windows on saccades during encoding, with particular emphasis on distinguishing between these possibilities.
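The contrast between the two hypotheses can be made concrete with a little geometry. The sketch below is an illustration only, not part of the reported analysis: it computes how much previously unseen area a window of a given size would uncover after a saccade, assuming that only the window from the immediately preceding fixation counts as "already seen" and ignoring the screen edges. The function names `overlap_area` and `new_area` are hypothetical.

```python
def overlap_area(w, h, dx, dy):
    """Area (deg^2) shared by two identical w x h windows whose centres
    differ by the saccade vector (dx, dy), all in degrees."""
    return max(0.0, w - abs(dx)) * max(0.0, h - abs(dy))


def new_area(w, h, dx, dy):
    """Area (deg^2) inside the new window that was masked on the
    previous fixation (simplifying assumption: one-fixation memory)."""
    return w * h - overlap_area(w, h, dx, dy)


# Horizontal window from Experiment 1 (12.5 x 3.1 deg), 4-deg saccade.
along_window = new_area(12.5, 3.1, 4.0, 0.0)    # saccade follows the window
across_window = new_area(12.5, 3.1, 0.0, 4.0)   # saccade leaves the window
print(f"horizontal saccade reveals {along_window:.2f} deg^2")
print(f"vertical saccade reveals {across_window:.2f} deg^2")
```

Under these assumptions a 4° vertical saccade with the horizontal window uncovers the full window area, roughly three times as much new image as a 4° horizontal saccade, which is why an information-maximizing strategy would favour saccades perpendicular to the long axis of the window.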

The results of Najemnik and Geisler (2008) emphasised that, for a full picture of the scanning process, saccade direction, saccade amplitude and fixation location distributions need to be analysed. In their study, they found that although fixation positions were biased to the top and bottom of the display (consistent with their ideal information maximization model and a horizontal visibility map), horizontal saccades were most common. These results could be reconciled by looking at the saccade amplitudes: Infrequent but large vertical saccades moved fixation to the top or bottom of the display, which was then explored with smaller but more common horizontal movements. The authors speculated that this strategy may also be due to our experience with natural scenes, where objects are often found on the horizon. To test the generalizability of this claim, we also looked at the interaction between window shape and scene type (landscapes vs. interiors), since we had found differences in these stimuli previously (Foulsham et al., 2008).

Experiment 1

Method

Participants

Inclusion in this study was contingent on having normal vision (without glasses) and on completing a good calibration on the eyetracker. Sixteen participants (9 females; age range, 18–24 years) took part for course credit and gave their informed consent.

Stimuli, apparatus and design

Eighty colour photographs showing indoor or outdoor scenes were used, half of which were presented in both the encoding phase and the test phase. All the images were high resolution and were collected from the Internet and commercially available collections and were resized to 1,024 × 768 pixels. Each encoding image was matched with a correct sentence (describing the state or position of something in the scene; e.g. “There is a towel on the bath”) and an incorrect sentence (e.g. “There is a towel on the floor”).

Eye movements were recorded using the Eyelink 1000 eyetracker (SR-Research). Participants were seated at a chinrest that ensured a constant viewing distance of 60 cm from the screen and eliminated head movements. Stimuli and instructions were presented on a 19-in. monitor with a 60-Hz refresh rate, the frame of which was visible throughout the experiment. The screen subtended approximately 30° × 25° of visual angle. Images were shown full-screen, and participants used a gamepad to respond after each trial. Eye movement events were parsed using the default EyeLink 1000 algorithm, which identified saccades where the velocity of the eye position signal was greater than 30°/s and acceleration was above 8,000°/s².
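As a rough illustration of threshold-based saccade parsing, the following sketch flags a saccade onset when both the velocity and acceleration criteria quoted above are exceeded, and ends the saccade when velocity falls back below threshold. It is a simplified stand-in, not the EyeLink parser itself; the function name `detect_saccades`, the 1,000-Hz sampling rate and the onset/offset rule are assumptions made for the example.

```python
import math

VELOCITY_THRESHOLD = 30.0        # deg/s, from the text
ACCELERATION_THRESHOLD = 8000.0  # deg/s^2, from the text


def detect_saccades(x, y, sample_rate_hz=1000.0):
    """Return (onset, offset) index pairs for a gaze trace given in degrees.

    Indices refer to sample-to-sample intervals; offset is exclusive.
    The sampling rate is an assumption, not a value from the article.
    """
    dt = 1.0 / sample_rate_hz
    # Speed across each pair of consecutive samples, in deg/s.
    speed = [math.hypot(x[i + 1] - x[i], y[i + 1] - y[i]) / dt
             for i in range(len(x) - 1)]
    # Absolute change in speed per second, in deg/s^2.
    accel = [abs(speed[i + 1] - speed[i]) / dt
             for i in range(len(speed) - 1)]

    saccades, onset = [], None
    for i in range(len(accel)):
        v = speed[i + 1]  # speed reached after the change accel[i]
        if onset is None:
            if v > VELOCITY_THRESHOLD and accel[i] > ACCELERATION_THRESHOLD:
                onset = i + 1
        elif v <= VELOCITY_THRESHOLD:
            saccades.append((onset, i + 1))
            onset = None
    if onset is not None:  # trace ended mid-saccade
        saccades.append((onset, len(speed)))
    return saccades


# Synthetic trace: 50 ms still, a 5-deg rightward step over 20 ms, 50 ms still.
x = [0.0] * 50 + [0.25 * k for k in range(1, 21)] + [5.0] * 50
y = [0.0] * len(x)
print(detect_saccades(x, y))
```

A production parser would also smooth the signal and apply minimum-duration criteria; those steps are omitted here to keep the thresholding logic visible.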

All the participants saw the same images in a random order. Four viewing conditions were used, and these were presented within participants in a blocked fashion, counterbalanced among participants: normal viewing, square gaze-contingent window, horizontal window and vertical window (see Fig. 1).

Stimuli in the normal condition were presented full-screen without modification. The gaze-contingent window conditions were continuously updated in response to the participants’ eye movements, and this process was controlled by EyeLink’s Experiment Builder software with custom Python programming. In each case, the stimulus consisted of a grey mask filling the screen, with a gaze-contingent window overlaid. This window had dimensions of 6.2° × 6.2°, 12.5° × 3.1° and 3.1° × 12.5° in the square, horizontal and vertical conditions, respectively. The area of the window was therefore the same in all the gaze-contingent conditions. A portion of the image of this size was cropped and centred on the current fixation, and this moved with the participant’s gaze, creating a moving window through which the image could be explored (see Fig. 2 for a description). On the basis of the size of these windows, visible information extended from the fovea to the parafovea (considered to be between about 1° and 5° from fixation) and, in the case of the elongated windows, into the periphery. A conservative estimate of the average time lag between an eye movement and the updating of the display is 24 ms (which included calculation of eye position, processing of the new image and monitor refresh rate). This lag is unlikely to have been detectable by observers (who are often unable to detect changes at lags of 80 ms; Loschky & Wolverton, 2007).
Fig. 2

The trial sequence during the encoding phase of Experiment 1. In the normal condition (bottom), participants saw a fixation point, followed by a scene presented for 10 s. This was then replaced with a sentence about the image that had to be verified. The procedure in the gaze contingent conditions (top) was the same, but the display was modified to show only a clear window at the current gaze location (white circle for demonstration only). When a saccade was made (white arrow), the window followed the point of gaze
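The geometry of the moving window can be sketched in a few lines. The example below converts the window sizes given above from degrees to pixels and returns the rectangle of the image that would remain visible around the current gaze position. It is not the authors' Experiment Builder code: the pixels-per-degree conversion simply divides the 1,024 × 768 image by the approximate 30° × 25° screen extent, and the names `window_rect` and `window_area_deg2` are hypothetical.

```python
SCREEN_PX = (1024, 768)     # image size in pixels, from the text
SCREEN_DEG = (30.0, 25.0)   # approximate screen extent in degrees, from the text

# Window width x height in degrees for each gaze-contingent condition.
WINDOWS_DEG = {
    "square": (6.2, 6.2),
    "horizontal": (12.5, 3.1),
    "vertical": (3.1, 12.5),
}


def window_area_deg2(condition):
    """Window area in square degrees."""
    w, h = WINDOWS_DEG[condition]
    return w * h


def window_rect(gaze_px, window_deg):
    """Pixel bounds (left, top, right, bottom) of the clear window centred
    on the gaze position, clipped to the screen. Everything outside this
    rectangle would be covered by the grey mask."""
    px_per_deg_x = SCREEN_PX[0] / SCREEN_DEG[0]  # linear approximation
    px_per_deg_y = SCREEN_PX[1] / SCREEN_DEG[1]
    half_w = window_deg[0] * px_per_deg_x / 2
    half_h = window_deg[1] * px_per_deg_y / 2
    left = max(0, round(gaze_px[0] - half_w))
    top = max(0, round(gaze_px[1] - half_h))
    right = min(SCREEN_PX[0], round(gaze_px[0] + half_w))
    bottom = min(SCREEN_PX[1], round(gaze_px[1] + half_h))
    return left, top, right, bottom


for name in WINDOWS_DEG:
    print(name, round(window_area_deg2(name), 2), window_rect((512, 384), WINDOWS_DEG[name]))
```

The printed areas (about 38.4 deg² for the square and 38.8 deg² for the rectangles) show that the three windows are matched in area to within about 1%.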

Stimuli appeared in all four conditions, across participants, and at encoding, images were equally likely to be paired with a correct or an incorrect sentence. At test, the images from encoding were presented again, interleaved with the same number of unseen images.

Procedure

Following calibration with a 9-point grid, two practice trials were given in order to familiarize the participants with the gaze-contingent display. The experiment proper then began with the encoding phase (Fig. 2).

Participants were shown four blocks of ten images, one block for each of the four viewing conditions. Each encoding trial began with a central fixation point, which participants were required to fixate before the trial began and which therefore ensured that scanning started in the centre. The image then appeared and remained on the screen for 10 s. Participants were instructed to inspect the scene and try to remember it for the sentence verification task. Following the image, a sentence appeared that could be correct or incorrect with regard to the previous scene. Participants were required to press one of two keys on the gamepad to indicate whether the sentence was correct or not, and this keypress terminated the display and initiated the next trial.

When all the encoding trials were complete, participants were given a surprise memory test for the images, which we will use as an additional measure of how well image encoding can proceed under the different viewing conditions. Participants were instructed to view each image and decide whether they had seen it previously in the encoding block. All 80 images (half of which were the ones seen at encoding) were then displayed in a random order. Each test trial began with a fixation point, followed by the presentation of the scene. The image remained on the screen until the participant made an old/new judgment by pressing one of two keys on the gamepad. The experimenter continued to monitor the validity of the eyetracker calibration, and it was recalibrated after encoding and whenever necessary to maintain a good calibration.

Analysis and results

We used participants’ memory as a preliminary indicator of encoding performance. In the subsequent recognition test, scenes were correctly recognized on 75% of trials (mean false alarm rate = 12%), and accuracy did not vary reliably with the viewing condition at encoding, F(3, 45) = 1.1, p = .34. There was a marginally reliable effect on correct recognition time, F(3, 45) = 2.7, p = .059. Recognition was fastest when the stimulus had been seen under normal conditions (mean RT = 2,990 ms) or when a square window had been used at encoding (3,014 ms). The asymmetric conditions were associated with the slowest performance (horizontal = 3,765 ms; vertical = 3,747 ms). Thus, there was some indication that scene encoding was less efficient in the asymmetric gaze-contingent conditions, which is broadly consistent with previous reports (Saida & Ikeda, 1979). Our subsequent analyses concentrated on the way in which the scenes were scanned with saccadic eye movements.

We first looked at eye movement measures across the whole trial: the number of fixations, their mean duration, and the mean amplitude of the saccades made. We then focused on our main question by looking at saccade direction and amplitude in the different conditions. In this study, we were concerned only with behaviour at encoding, where there was a fixed trial time. In each case, we compared the viewing conditions using a within-subjects ANOVA. Where necessary, each pair of conditions was then compared using post hoc Tukey tests, which compensate for the familywise error associated with making multiple comparisons.

General eye movement measures (Table 1)

Viewing condition had a significant effect on the number of fixations made on each trial, F(3, 45) = 5.47, p < .05, but no reliable effect on the average fixation duration, F(3, 45) = 2.10, p = .11. Gaze-contingent trials tended to have more fixations, with a slightly shorter duration, than normal viewing. Pairwise comparisons demonstrated that the square window elicited reliably more fixations than did both normal viewing, q(15) = 4.3, p < .05, and viewing with a vertical window, q(15) = 5.6, p < .01. No other comparisons were reliable.
Table 1

Measures quantifying general eye movement behaviour during the scene-encoding task

Measure                              Normal   Square window   Horizontal window   Vertical window
Number of fixations per trial   M    33.55    37.38           35.75               35.23
                                SE   0.88     1.07            0.84                0.86
Fixation duration (ms)          M    258      235             244                 251
                                SE   11.1     7.4             7.1                 7.8
Saccade amplitude (°)           M    6.7      3.9             4.6                 3.7
                                SE   0.22     0.21            0.23                0.21

There was a highly reliable effect of condition on the mean saccade amplitude, F(3, 45) = 117.96, p < .001. The gaze-contingent conditions were characterized by saccades several degrees shorter, on average, than in normal viewing, all qs(15) > 13, all ps < .01. However, saccades in the horizontal condition were not as short as those in the vertical and square conditions, both qs(15) > 8, both ps < .01; a horizontal window did not produce such a severe change in the size of scanning movements. The vertical and square conditions did not differ reliably.

Saccade direction

Our analyses of saccade direction did not include any saccades that were shorter than 1°, so as to exclude readjustive saccades, microsaccades or minor artifacts of the eyetracker. Figure 3 illustrates the relative frequency of saccades in each direction, binned into 36 bins of 10°. All of the conditions contained a high proportion of horizontal saccades, but there was a change in the vertical condition, with more vertical saccades being made in those trials.
Fig. 3

Saccade direction distributions for the four viewing conditions. The extent of each plot shows the frequency of saccades in each of 36 direction bins. A leftward saccade is labeled zero, and all other directions are numbered clockwise from this. The shaded areas on the first plot indicate the arcs used to count vertical (shaded) and horizontal (unshaded) saccades

To perform statistics, we divided the full range of directions into four 90° arcs, centred on the cardinal directions (see the shaded regions in Fig. 3). We first confirmed the symmetry of the plots in Fig. 3. There was no difference in the frequency of upward versus downward saccades, and no difference in the frequency of leftward versus rightward saccades, in any of the conditions, all ts(15) < 1. As a result, we collapsed all the saccades into two categories, vertical and horizontal, and computed the frequency of saccades in each category for each participant in each condition. Finally, we calculated the proportion of horizontal saccades (hereafter, the HVP, calculated as the frequency of horizontal saccades divided by the frequency of all saccades). An HVP of 1 would indicate that all the saccades made were horizontal, whilst an HVP of 0 would show complete dominance of vertical eye movements.
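The classification described above can be sketched as follows. Each saccade is represented by its horizontal and vertical displacement in degrees; a saccade falls in a horizontal 90° arc when its horizontal component is at least as large as its vertical component, and saccades shorter than 1° are discarded first. The function name `hvp` and the handling of saccades lying exactly on a 45° boundary (counted here as horizontal) are assumptions for the example, not details taken from the analysis.

```python
import math


def hvp(saccades, min_amplitude=1.0):
    """Proportion of horizontal saccades among all retained saccades.

    saccades: iterable of (dx, dy) displacements in degrees.
    Saccades shorter than min_amplitude are excluded, as in the text.
    Returns None when no saccade survives the amplitude filter.
    """
    kept = [(dx, dy) for dx, dy in saccades
            if math.hypot(dx, dy) >= min_amplitude]
    if not kept:
        return None
    # Within 45 deg of the horizontal axis <=> |dx| >= |dy|.
    horizontal = sum(1 for dx, dy in kept if abs(dx) >= abs(dy))
    return horizontal / len(kept)


# Made-up displacements for one participant in one condition.
example = [(3.0, 0.5), (-4.0, 1.0), (0.2, 5.0), (0.5, 0.2), (2.0, -1.0), (-1.0, -6.0)]
print(hvp(example))
```

In this made-up example one saccade is dropped for being shorter than 1°, and three of the remaining five are horizontal. Computing the value separately for each participant and condition gives the per-participant scores that enter the ANOVA.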

In normal viewing, with no gaze-contingent window, there was a mean HVP of .70 (SE = .01). This quantifies the horizontal bias, which was present in all participants and reliably different from an equal proportion of saccades in each direction (one-sample t test against an HVP of .5), t(15) = 22.0, p < .001. Viewing condition had a reliable effect on the HVP, F(3, 45) = 11.44, p < .001. The vertical window produced a significantly smaller horizontal bias than in normal viewing (M ± SE = .59 ± .03), q(15) = 5.4, p < .01. However, the horizontal and square conditions were not significantly different from normal (.72 ± .01 and .67 ± .01, respectively). The vertically oriented window elicited reliably more vertical and fewer horizontal saccades, leading to a lower HVP than for the other shapes, both qs(15) > 4, both ps < .05. The square window resulted in behaviour somewhere between that in the two rectangular conditions, with a less pronounced horizontal tendency than in the horizontal condition, q(15) = 4.9, p < .05. These findings demonstrate that the window shape modified the frequency of saccades made in different directions.

These results are taken from the whole 10-s trial. However, saccade dynamics can change over time as more is learnt about the scene and objects are inspected in detail (Unema, Pannasch, Joos, & Velichkovsky, 2005). We therefore broke down the saccade distribution by ordinal saccade number (Fig. 4). We analysed only the first ten saccades, in order to see how quickly the effects of condition became apparent and because there were no other deviations in the effects of condition on later saccades. The first data point in this figure, for example, shows the mean proportion of first saccades that moved horizontally. A 4 (condition) × 10 (saccade) repeated measures ANOVA indicated that over the first ten saccades there was an effect of condition, F(3, 45) = 7.0, p = .001, but no effect of saccade number, F(9, 135) = 1.5, p = .17. These main effects were qualified by an interaction: The change in scanning with a different-shaped window was not equal on all saccades, F(27, 405) = 3.1, p < .001. Of particular interest is the very first saccade, which tended to be horizontal in all the conditions except the vertical window condition, which elicited more vertical eye movements (different from all conditions), all qs(15) > 4, all ps < .05. The vertical condition continued to be different from the other viewing conditions on most saccades, although the other conditions showed a large degree of overlap. The simple main effect of condition was reliable on six of the first ten saccades, all Fs(3, 13) > 4.5, ps < .05, with the exception of the 2nd, 4th, 5th and 10th, where the conditions were not significantly different.
Fig. 4

Saccade direction over time. The lines show the mean horizontal:vertical proportion (HVP) for the first 10 saccades in the different conditions (the pattern was similar for the remaining saccades, but for clarity, the plot has been curtailed at the 10th saccade). An HVP of .5 (dotted line) indicates an equal proportion of horizontal and vertical saccades; higher values show that there were more horizontal eye movements

Saccade amplitude

In light of the differences in saccade direction, it is pertinent to ask how amplitude and direction interact as a function of the window shape. If the changes in the saccade direction distribution were due to saccades that target locations within the window, we would expect the majority of saccades to have amplitudes of less than the extent of the viewing aperture. Saccades larger than this would have been made towards the masked background and would, therefore, indicate strategic or top-down selection.
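One way to express this prediction is to check, for each saccade, whether its landing point fell inside the window that was visible on the preceding fixation. The sketch below does this for a window centred on the launch position; it is an illustration under that assumption rather than the analysis reported here, and `lands_inside_window` and `proportion_inside` are hypothetical names.

```python
def lands_inside_window(dx, dy, window_w, window_h):
    """True if a saccade with displacement (dx, dy), in degrees, ends inside
    a window_w x window_h window centred on the launch fixation."""
    return abs(dx) <= window_w / 2 and abs(dy) <= window_h / 2


def proportion_inside(saccades, window_w, window_h):
    """Proportion of saccades that target a location visible before launch."""
    if not saccades:
        return None
    inside = sum(lands_inside_window(dx, dy, window_w, window_h)
                 for dx, dy in saccades)
    return inside / len(saccades)


# Made-up saccade vectors, scored against the horizontal window (12.5 x 3.1 deg).
example = [(5.0, 0.0), (0.0, 2.0), (7.0, 0.0), (-3.0, 1.0)]
print(proportion_inside(example, 12.5, 3.1))
```

With the horizontal window, a 5° horizontal saccade stays within the visible region, whereas a 2° vertical saccade already lands on the mask, so the same amplitude can mean quite different things depending on direction.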

Figure 5 shows the histogram of horizontal and vertical saccade amplitudes. We considered the whole distribution, because it is possible that the window conditions might have led to bi- or multi-modal distributions (e.g. by increasing the frequency of small and large saccades at the expense of intermediate lengths) and because saccade lengths tend to be positively skewed. We compensated for this skew in our statistics by analyzing the participant medians.
Fig. 5

Saccade amplitude histograms for the different conditions. Each plot shows the relative frequency of saccades with the given amplitude, for both horizontal (grey) and vertical (black) eye movements. Charts from the gaze-contingent conditions mark the extent of the moving window in each direction (vertical dotted lines)
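The reason for analysing participant medians can be illustrated with a short sketch. The data layout (a dictionary mapping a participant label to a list of amplitudes) and the numbers in the usage note are hypothetical and serve only to show how a median resists the pull of a few long saccades.

```python
from statistics import mean, median

def participant_medians(amplitudes_by_participant):
    """Return one median saccade amplitude per participant.

    amplitudes_by_participant: dict mapping a participant label to a list
    of saccade amplitudes in degrees (hypothetical layout for this sketch).
    Participants with no saccades are skipped.
    """
    return {
        pid: median(amplitudes)
        for pid, amplitudes in amplitudes_by_participant.items()
        if amplitudes
    }

def mean_of_medians(amplitudes_by_participant):
    """Average the participant medians to obtain a group-level summary."""
    return mean(participant_medians(amplitudes_by_participant).values())
```

For example, a made-up participant with amplitudes of 2°, 3°, 4° and 20° has a mean of 7.25° but a median of 3.5°, so the single long saccade has far less influence on the summary that enters the ANOVA.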

The amplitude distributions were unimodal and characterized by a few, very small saccades (of 1° or less), a majority of saccades of amplitude between about 1.5° and 4°, and a gradually decreasing frequency of larger eye movements. In normal viewing, horizontal and vertical saccades show similar distributions with a mode at about 1.5° and a median amplitude of 6.2° and 4.1° for horizontal and vertical saccades, respectively. The distribution for vertical saccades is sharper, with fewer long saccades: Only 17% of vertical saccades were over 8°, as compared with 30% of horizontal eye movements.

In comparison with normal viewing, the gaze-contingent window conditions had narrower distributions with a larger mode but fewer long saccades, resulting in lower medians. For example, the medians for horizontal and vertical saccades in the square condition were 3.4° and 3.0°, respectively, lower than those for normal viewing but showing the same trend towards more large horizontal eye movements.

Looking at the bottom two panels in Fig. 5, it is clear that the asymmetrical window shapes led to a systematic change in saccade length. With a horizontal window, the distribution of horizontal saccades was more spread out and had a higher mode and more long-range saccades, leading to a higher average (median = 5.1°), relative to vertical eye movements (median = 2.9°). With the vertical shape, the opposite pattern was observed, and this was the only case where there were more large saccades moving vertically (median = 3.5°) than horizontally (median = 2.9°). An omnibus ANOVA performed on the participant medians confirmed that there was an effect of viewing condition, F(3, 45) = 76.6, p < .001. Direction was also reliable, with horizontal saccades resulting in a longer median, overall, than did vertical eye movements, F(1, 15) = 91.3, p < .001. However, these effects were qualified with an interaction, F(3, 45) = 89.4, p < .001. The median amplitude of horizontal saccades was greater than that of vertical saccades in normal viewing and on those trials with a square or a horizontal window, all ts(15) > 3.8, ps < .005. However, on trials with a vertical window, the median amplitude of vertical saccades was larger, t(15) = 3.6, p < .005.

The dotted lines in Fig. 5 indicate the extent of the moving window in each direction, which gives an idea of the frequency of saccades landing within versus beyond the window. We compared the landing site of each saccade with the coordinates of the aperture on the previous fixation. Although saccades in the gaze-contingent conditions were shorter than normal, about 50% of all the saccades went outside the window. In the square condition, 49% of the horizontal saccades and 37% of the vertical saccades went outside the window. The pattern in the horizontal condition (horizontal saccades, 37% outside the window; vertical saccades, 78%) was precisely the opposite of that seen on trials with a vertical window (horizontal saccades, 79%; vertical saccades, 32%). These observations suggest that it was perfectly possible for people to saccade to parafoveal or peripheral locations that were empty. In other words, the length of saccades was not completely curtailed by the presence of a gaze-contingent boundary, as evidenced by, for example, the tendency to make vertical eye movements beyond the edge of a horizontal window.
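The comparison of each landing site with the aperture on the previous fixation can be sketched as follows. This is a minimal illustration that assumes the window is a rectangle centred on the launch fixation and that its size is given as a full width and height in degrees; the function names and the coordinate format are hypothetical.

```python
def lands_outside_window(start, end, full_width, full_height):
    """True if a saccade from `start` lands outside the window at `start`.

    start, end: (x, y) positions in degrees of visual angle.
    full_width, full_height: ASSUMED full extents of a rectangular window
    centred on the launch fixation.
    """
    dx = abs(end[0] - start[0])
    dy = abs(end[1] - start[1])
    return dx > full_width / 2.0 or dy > full_height / 2.0

def percent_outside(saccades, full_width, full_height):
    """Percentage of (start, end) saccade pairs landing outside the window."""
    saccades = list(saccades)
    if not saccades:
        return float("nan")
    n_outside = sum(
        lands_outside_window(start, end, full_width, full_height)
        for start, end in saccades
    )
    return 100.0 * n_outside / len(saccades)
```

Applying `percent_outside` separately to the horizontal and the vertical saccades in each condition would give percentages of the kind reported above.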

Conclusions from Experiment 1

There was a clear effect of the gaze-contingent viewing conditions, relative to normal viewing, and of the shape of the window. Image viewing with a moving window was characterized by more fixations and shorter saccades, and window shape had a differential effect on scanning direction and amplitude. There was a predominance of horizontal saccades in all the conditions, which suggests that this bias is not dependent solely on the visual features in the periphery (because it was also found in the masked-background conditions). However, a change from a horizontal window to a vertical one did change the pattern of saccade directions. A vertical window led to more vertical saccades: Participants preferred to move toward regions about which they already had some visible information. The distribution of saccade amplitudes shifted according to the boundaries of the window, although a significant number of saccades went beyond this boundary (i.e. into empty space). In the context of the memorization task, the moving-window conditions were detrimental to encoding and recognition, demonstrating that removing peripheral information had an impact on cognition.

In Experiment 1, the window conditions reduced the visual information available outside the aperture to zero (and in the case of the horizontal and vertical windows, this reduction was asymmetric). Complete masking of peripheral information is a rather artificial situation, and this may have been compounded in our experiment by the use of rectangular windows, which led to strong discontinuities and straight edges at the boundary of the window. It is possible that the predominance of horizontal and vertical saccades was affected by these properties of the moving window or that it was unnatural for participants to saccade into empty space.

In Experiment 2, we used a more subtle manipulation to control the information available for planning saccades. We had two aims with this additional experiment. First, our aim was to replicate the changes in saccades found with vertical versus horizontal windows in a moving-window display without straight edges and with a less pronounced discontinuity between the window and the surround. Specifically, we used an elliptical window, and rather than mask the background completely, we presented high-resolution information at fixation and a low-pass-filtered (i.e. blurred) version of the image as a background. Second, we tested to see whether the effects of window shape would be moderated by the amount of information in the periphery. With a blurred background, all possible saccade targets contain some information, and the saccadic system must decide whether to move within the window, where visual information is preserved, or into the periphery, where current information is still present but is degraded. As previously, we manipulated the extent of preserved information in different directions by using a horizontally or vertically oriented window, and we explored whether the changes in saccade direction and amplitude remained. If the pattern of saccades with different window shapes is different—for example, if the vertical window no longer produces a higher frequency of vertical saccades—it would suggest that when some peripheral features are present, they are used by the saccadic system, perhaps in computations to maximize the information gained. Moreover, any differences between the experiments would demonstrate the importance of having something in the periphery, as opposed to nothing at all. One way that the extent of peripheral information might have an effect on eye movements is if scene type has an effect on the direction biases observed. We therefore also looked at saccade direction in both landscapes and interior scenes, for both experiments.

Experiment 2

Method

Participants

Twenty-six participants (18 females) took part in this experiment, none of whom had taken part in Experiment 1. All the participants were students (age range, 18–24 years), who took part for course credit and had normal vision.

Stimuli, apparatus and design

The 40 encoding stimuli from Experiment 1 were used again here. Each participant saw all the stimuli once, divided into two blocks of 20 trials. Half the stimuli were landscapes, and half were interiors. Two viewing conditions were used: viewing with either a horizontally or vertically oriented elliptical gaze-contingent window (Fig. 6). Viewing condition was blocked within participants, and block order was counterbalanced. Across participants, all the stimuli appeared equally often in both conditions.
Fig. 6

The two gaze-contingent conditions in Experiment 2. In each case, a high-resolution elliptical window (highlighted here by the dashed white boundary that was not in the actual stimuli) was presented at fixation, over a blurred background

The gaze-contingent display functioned in the same way as in Experiment 1. However, rather than a completely masked background, a low-pass-filtered version of the current stimulus appeared outside the window of fixation. The low-pass versions were produced in Adobe Photoshop by convolution with a Gaussian blur filter, the standard deviation of which was 0.5°. This relatively severe level of blur attenuated spatial frequencies greater than approximately 2 cycles/deg, frequencies which are well above perceptual thresholds, even at large eccentricities. This level of blur was chosen, on the basis of pilot studies, to be noticeable to participants, while still enabling general scene content to be determined from the blurred image (for manipulations of the degree of peripheral blur necessary for participants to become aware of the manipulation, see Loschky et al., 2005). The window around fixation was a regular ellipse with major and minor axes of 12.5° and 3.1° (or vice versa for the vertical window), which were the same as the dimensions of the rectangular windows used previously.
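The geometry of the elliptical window can be sketched as a simple membership test. This is an illustration only, not the display code used in the experiment: it assumes that positions are expressed in degrees of visual angle, that the ellipse is centred on the current fixation, and that 12.5° and 3.1° are the full lengths of the two axes. The function and parameter names are hypothetical.

```python
def inside_elliptical_window(px, py, fx, fy, full_width=12.5, full_height=3.1):
    """True if point (px, py) falls inside an ellipse centred on (fx, fy).

    All values are in degrees of visual angle. The defaults describe the
    horizontal window; swap them for the vertical window. Treating 12.5 and
    3.1 as full axis lengths is an ASSUMPTION of this sketch.
    """
    a = full_width / 2.0   # semi-axis along x
    b = full_height / 2.0  # semi-axis along y
    return ((px - fx) / a) ** 2 + ((py - fy) / b) ** 2 <= 1.0

def choose_pixel(px, py, fx, fy, sharp_value, blurred_value, **window):
    """Pick the intact value inside the window and the blurred value outside."""
    if inside_elliptical_window(px, py, fx, fy, **window):
        return sharp_value
    return blurred_value
```

Note that a point such as (6°, 1.5°) from fixation would fall inside a 12.5° × 3.1° rectangle but outside the ellipse with the same extents, so the elliptical window preserves somewhat less area than its rectangular counterpart.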

Procedure

We again used the picture–sentence verification task from Experiment 1. Participants saw all images in a random order, with the instructions that they should “look carefully at the pictures so as to verify the accuracy of the following sentence”.

The procedure began with instructions, and the eyetracker was calibrated as previously. Two practice trials were then given, in order to familiarize participants with the apparatus and task. The experiment proper then began, and each trial proceeded in exactly the same way as in Experiment 1. No subsequent memory test was given.

Analysis and Results

Eye movements

We look first at general eye movement parameters before examining the saccade direction and amplitude distributions. Our comparisons of interest were (1) whether the orientation of the window (horizontal or vertical) affected viewing and (2) whether the effect in these blurred conditions was the same as with the rectangular, masked windows in Experiment 1. If the effects from the first experiment were caused by a reluctance to orient to a blank mask, the results with a blurred periphery should be more similar to normal viewing and should show less of an effect of window shape.

General eye movement measures

The mean number of fixations per trial, fixation duration and saccade amplitude were subjected to a 2 × 2 mixed ANOVA with the within-subjects factor of window shape (horizontal or vertical) and the between-subjects factor of experiment. For both number and duration of fixations, there were no differences between conditions or experiments and no interactions, all Fs < 2.3, ps > .14. Moreover, independent t tests yielded no differences in these measures between either condition in Experiment 2 and normal viewing in Experiment 1, both ts(40) < 1.

There was no main effect of experiment on saccade amplitude, F(1, 40) < 1: Masking and blurring led to saccades of a similar mean length. Although there was an interaction between experiment and window shape, F(1, 40) = 6.3, p < .05, saccades in Experiment 2 remained longer with a horizontal ellipse window (M = 4.4°, SE = 0.21) than with a vertical window (M = 4.0°, SE = 0.19), t(25) = 3.5, p = .001. This is the same difference as that seen in Experiment 1, although it was not as severe. In both conditions, saccades remained shorter than those seen in normal viewing, both ts(40) > 7, both ps < .001. This is consistent with the findings from Experiment 1: Gaze-contingent windows led to shorter saccades.

Thus, the window manipulation in this experiment produced results similar to those in Experiment 1, although the number and duration of fixations were relatively less affected (as compared with normal viewing) than in the fully masked conditions in Experiment 1.

Saccade direction

The pattern of saccade directions was very similar to that in Experiment 1: Most saccades moved left or right, but in the vertical condition, the eyes also moved up and down quite frequently (see Fig. 7).
Fig. 7

Saccade direction distributions for the horizontal and vertical ellipse conditions in Experiment 2. Data are plotted in the same way as in Fig. 3

The mean HVP was .71 (SE = .01) in the horizontal condition and .59 (SE = .01) in the vertical condition. There were more vertical saccades and fewer horizontal ones when the elliptical window was vertically oriented, t(25) = 8.8, p < .001. The vertical condition was reliably different from normal viewing, t(40) = 7.0, p < .001, but the horizontal condition did not differ reliably, t(40) < 1. There was no evidence that the pattern was any less pronounced than in Experiment 1 (no effect of experiment and no interaction; both Fs(1, 40) < 1).

Saccade amplitude

Figure 8 shows the distribution of saccade amplitudes for horizontal and vertical saccades in the two window shapes. We compared the participant medians (in order to account for the skew in the data), computing an ANOVA with 2 saccade directions ×2 window shapes, and with an additional between-subjects factor of experiment (to compare blurring and masking).
Fig. 8

Saccade amplitude histograms for the elliptical windows conditions in Experiment 2. The relative frequency of saccades of each amplitude is plotted as in Fig. 5. Dotted lines mark the maximal extent of the elliptical boundary in each direction

Overall, saccades were marginally shorter with the blurred display than in Experiment 1 (Experiment 1, M = 3.6°; Experiment 2, M = 3.2°), F(1, 40) = 3.6, p = .065. Main effects of direction, F(1, 40) = 32.3, p < .001, and window shape, F(1, 40) = 27.8, p < .001, and an interaction between them, F(1, 40) = 368.8, p < .001, confirmed the pattern in Experiment 1: With a horizontal window, horizontal saccades peaked at higher amplitudes than did vertical saccades (Ms = 4.2° vs. 2.9°), but with a vertical window, the pattern was reversed and vertical eye movements were longer (Ms = 3.6° vs. 3.1°). There was also an interaction between experiment and direction, F(1, 40) = 4.4, p = .04. Horizontal saccades had a higher median than did vertical saccades, but this difference was less pronounced in Experiment 2 (Ms = 3.3° and 3.1° for horizontal and vertical saccades, respectively) than in Experiment 1 (Ms = 3.9° and 3.3°). Finally, there was a three-way interaction, suggesting that the shift in the amplitude of saccades of different directions elicited by different windows was different in Experiment 2, F(1, 40) = 15.1, p < .001. However, although the differences were less pronounced in Experiment 2, breaking down the effects within this experiment showed the same result: Direction interacted with window shape, F(1, 25) = 144.2, p < .001. With a horizontal ellipse window, horizontal saccades (M = 3.8°) had larger median amplitude than did vertical saccades (M = 2.8°). With a vertical ellipse, the opposite was true (horizontal, M = 2.8°; vertical, M = 3.3°).

How often did saccades move outside the window of preserved visibility? Saccades were just as likely to move outside the window in Experiment 2 as in Experiment 1, t(40) < 1. As before, the direction of these saccades followed the orientation of the window (see also the dotted lines in Fig. 8). With a horizontal window, more vertical than horizontal saccades landed outside the window (69% vs. 37%), and with a vertical window, the opposite was true (39% of vertical saccades vs. 75% of horizontal saccades). Thus, blurring the peripheral information, as opposed to masking it completely, did not seem to affect the deployment of saccades outside the window. How were these eye movements controlled, given that the features at their landing site were invisible or degraded? The next section looks at the properties of these saccades in more detail.

The control of saccades beyond the window

We performed additional analyses of all the eye movements from the horizontal and vertical conditions (since these were the most comparable between experiments). First, we checked whether the trends in saccade direction held for the saccades outside the window. All of these eye movements were targeted at masked or blurred regions. If they also followed the direction of the window, it would indicate control based on memory or expectations of what was there (in the case of a masked periphery) or on limited, low-spatial-frequency information (in Experiment 2).

It was most appropriate for this analysis to compare saccades of similar amplitude, so we looked at all saccades longer than 6.5°, placing their endpoints beyond the window in all conditions from both experiments (Experiment 1, N = 2,304 saccades; Experiment 2, N = 6,313). Window shape continued to have an effect on the direction of these long saccades, F(1, 40) = 40.5, p < .001. However, this interacted with experiment, F(1, 40) = 24.6, p < .001. In Experiment 1, even large saccades were more likely to be horizontal with a horizontal window (mean HVP = .84) than with a vertical window (mean HVP = .52), paired t test, t(15) = 5.1, p < .001. This same trend was reduced in Experiment 2 (horizontal window, HVP = .71; vertical window, HVP = .67), although it did reach one-tailed significance, t(25) = 1.77, p = .04. Thus, the differences in saccade direction in Experiment 1 were also found in large saccades that were targeted at locations outside the window, even though there was no difference in the information at these points. However, when peripheral information was blurred, window shape had less of an effect on large saccades. On those occasions in which gaze moved outside the window, an increase in the information in the periphery ameliorated the effect of the gaze-contingent window on saccade direction.

We also analysed the properties of all eye movements landing outside the window, in order to test two specific predictions about the control of these saccades. First, it might take longer to initiate a saccade to these locations, perhaps because masking or blurring reduces the saliency of points beyond the window boundary. To test this, we looked at the duration of the fixation preceding the eye movement: Systematically longer fixation durations would suggest an increased preparation time for these saccades. Second, given that the saccades landed on targets that were masked or blurred, they should not be ideally positioned and might lead to a corrective eye movement. For example, people rarely fixate an empty background, but they may have erroneously done so if these regions were masked. If this happened often, participants may have terminated the resulting fixation early and made a short saccade to a more optimal position. We therefore computed the duration of the following fixation and the amplitude of the following saccade, and we predicted shorter fixations and smaller saccades after saccades landing outside the window. In each case, we compared the average for saccades landing within the window with that for saccades landing outside, in masked viewing (Experiment 1) and blurred viewing (Experiment 2), collapsed across horizontal and vertical window shapes.

Saccades outside the window were not associated with reliably longer prior fixation durations in either Experiment 1 (outside, M = 245 ms; inside, M = 249 ms) or Experiment 2 (outside, M = 249 ms; inside, M = 250 ms), both ts < 1. However, in Experiment 1, the fixation following a saccade outside the window (M = 232 ms) was reliably shorter than one directed within the window (M = 257 ms), t(15) = 6.5, p < .001. The same trend was also reliable in Experiment 2 (outside, M = 242 ms; inside, M = 251 ms), t(25) = 2.3, p < .05. The median amplitude of the saccade following an eye movement outside of the window was slightly shorter than that following an eye movement within the window in both experiments. This difference was negligible in Experiment 1 (outside, M = 3.65°; inside, M = 3.69°), t(15) < 1, but reached significance in Experiment 2 (outside, M = 3.13°; inside, M = 3.36°), t(25) = 3.2, p < .005. To summarize these results, whether a saccade moved inside or outside the window made no difference to the previous fixation duration. However, in line with our predictions, the subsequent fixation was shorter in duration and the following saccade had a smaller amplitude if an eye movement went outside the window.

The effect of scene type

We previously reported a reduced horizontal bias and more vertical saccades when participants viewed interior scenes than when they viewed landscapes (Foulsham et al., 2008). In the present research, the windowed conditions reduced the availability of peripheral scene content. Was saccade direction in these conditions sensitive to the type of scene? If the change in eye movements in interiors occurs because scene type is recognized from peripheral information, we would expect similar scanning in both landscapes and interiors in the moving-window conditions, particularly in Experiment 1, where the background was completely masked. Looking at the interaction between window shape and scene type will also help us explore the relationship of the windowed conditions to normal scene viewing.

Figure 9 shows the HVP calculated separately for the 20 images that were landscapes and the 20 interior scenes. There were reliable effects of scene type [Experiment 1, F(1, 15) = 11.4, p < .005; Experiment 2, F(1, 25) = 67.3, p < .001], with more horizontal saccades in landscapes than in interiors. Furthermore, there was no interaction with window condition [Experiment 1, F(3, 45) < 1; Experiment 2, F(1, 25) = 1.7, p = .2] and no scene type × experiment interaction, F(1, 40) = 2.3, p = .14. Thus, the sensitivity of eye movements to the type of scene does not depend on the availability of peripheral information.
Fig. 9

The proportion of horizontal saccades in landscapes (e.g. top left panel) and interiors (e.g. top right panel). Means (with standard error bars) are shown for each condition in the two experiments

Distribution of saccade endpoints

A residual question from our experiments concerns the location of fixations around the scene. This is important for two reasons. First, there are several biases known to affect the overall spatial distribution of fixations in an image, such as a central bias, and it is interesting to ask whether the gaze-contingent window modified these biases. Second, Najemnik and Geisler (2008) showed that both human observers and an ideal observer model tended to fixate in a “donut”-shaped region around the centre of a search display and, particularly, at the top and bottom of this ring. This pattern complemented their analyses of saccade amplitudes and direction: Although horizontal saccades were more likely in their study, they suggested that in order to maximize information, initial, infrequent vertical saccades moved fixation towards the top or bottom, which was then explored with more frequent but shorter horizontal saccades. In sum, fixations were most common at the top and bottom of the display, which, according to these authors, indicated optimal positioning of the horizontally elongated region of visibility. Thus, looking at the fixation distribution is a key way in which to distinguish between alternative interpretations of our own data.

We plotted the spatial distribution of saccade endpoints as a function of condition (columns in Fig. 10). To compare normal viewing with windows of a different shape, we will present this for the normal and square viewing conditions from Experiment 1, alongside the horizontal and vertical conditions from Experiment 2 (the distributions were similar in the asymmetrical window conditions from the first experiment). In each case, plots were generated by cumulatively adding a 2-D Gaussian patch (σ = 1°) at the location of the endpoint of each saccade. High values, represented by warm colours in the map, indicate the locations where fixations were most common. To explore trends over time, the rows in Fig. 10 split the data by saccade index within each trial (saccades 1–5, 6–10 and so on). Saccades after the 20th saccade were not included in this analysis, because some participants did not make that many saccades in all the conditions and, so, there were too few data points.
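The construction of these maps can be sketched as follows. This minimal example accumulates an unnormalised 2-D Gaussian patch (σ = 1°) at each endpoint on a coarse grid. The grid resolution, the image size passed in by the caller and the coordinate convention (degrees, with the origin at one corner of the image) are assumptions of the sketch, not details of the plotting procedure actually used.

```python
import math

def endpoint_density_map(endpoints, width_deg, height_deg,
                         cell_deg=0.5, sigma_deg=1.0):
    """Accumulate a Gaussian patch at each saccade endpoint.

    endpoints: iterable of (x, y) landing positions in degrees.
    width_deg, height_deg: ASSUMED image extent in degrees.
    cell_deg: ASSUMED grid resolution; each cell is evaluated at its centre.
    Returns a list of rows (lists of floats); higher values mark locations
    where endpoints were more common.
    """
    n_cols = int(round(width_deg / cell_deg))
    n_rows = int(round(height_deg / cell_deg))
    density = [[0.0] * n_cols for _ in range(n_rows)]
    two_sigma_sq = 2.0 * sigma_deg ** 2
    for ex, ey in endpoints:
        for r in range(n_rows):
            cy = (r + 0.5) * cell_deg
            for c in range(n_cols):
                cx = (c + 0.5) * cell_deg
                dist_sq = (cx - ex) ** 2 + (cy - ey) ** 2
                # Unnormalised Gaussian with a peak value of 1 at the endpoint.
                density[r][c] += math.exp(-dist_sq / two_sigma_sq)
    return density
```

Splitting the endpoints by saccade index (1–5, 6–10 and so on) and calling the function once per subset would yield one map per row of Fig. 10; the resulting grids can then be rendered with any colour-mapped image plot.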
Fig. 10

Saccade endpoint distributions for the saccades of all participants in Experiment 1 and 2 (E1 and E2). Each column of plots shows the distribution for a different condition; rows group the saccades according to the index of the saccade in the trial. In each case, warmer colours indicate a higher frequency of saccades landing in that location

The plots reveal several interesting trends. First, there is a strong central bias in all the conditions, which occurs in the first few saccades but becomes less pronounced in later saccades. Second, the most frequently inspected points are around the image horizontal, whereas the top and bottom of the image are more likely to be neglected. Third, in general, there was a leftward bias, particularly in the first five saccades, where 69% of all the saccades landed in the left side of the image. Finally, the moving-window conditions resulted in some differences in the distribution of saccade endpoints, particularly when one looks at the 6th–10th saccade in the trial. In both the square window and vertical window conditions, there was a strong asymmetry in the plots: Saccades were more likely to move to the left of the image than to the right. This contrasts with the distribution seen in normal viewing, which is more evenly distributed, and that in the horizontal window condition, which actually seemed to produce a rightward bias in saccades 6–15.

Conclusions from Experiment 2

This experiment replicated the main finding from Experiment 1—that a vertically oriented window reliably reduced the horizontal bias—and extended that experiment in several ways. First, the effect remained in stimuli where peripheral information was blurred, rather than being completely removed, which is important because this is similar to the way that information is disrupted in natural vision. Second, the effect remained for large saccades that moved outside the window. Third, saccades outside the window were followed by shorter fixations and smaller saccades, demonstrating that decreased information at the saccade destination affected subsequent eye movements. Fourth, the gist of the scene moderated the pattern of saccade direction, even when peripheral information was masked, such that interiors led to fewer horizontal saccades and more vertical ones. Finally, although there was evidence for a central bias in the distribution of fixations, the concentration of fixations at the top and bottom of the display that was reported by Najemnik and Geisler (2008) was not seen in the task and stimuli used here.

General discussion

We investigated how the shape of the information around fixation affected some of the patterns in eye movement scanning during an unconstrained encoding task. We will begin by characterising normal scanning, before discussing the effect of different gaze-contingent windows and the implications for models of eye guidance in scenes.

Normal and gaze-contingent scanning

In Experiment 1, we replicated some of the eye movement biases that have been seen in other image-viewing tasks. There was a strong central bias in normal viewing, and this was strongest at the start of scene viewing (during the first five saccades). This is likely because the starting viewing position was constrained by the experiment to be at the centre of the screen and, presumably, as time went on, people were more likely to have moved further from the centre. Other factors that have been suggested to contribute to the central bias are the distribution of salient features or objects in the scene (photographer bias) or orbital reserve, and Tseng, Carmi, Cameron, Munoz, and Itti (2009) and Tatler (2007) have considered these factors in detail. There was also a slight leftward bias at the start of viewing, which is consistent with the results of Dickinson and Intraub (2009), who recently reported a leftward asymmetry in scene perception. The preference to move to the left side of the image was found across a range of scenes and, therefore, seems unlikely to be caused by an uneven distribution of features or objects within the scene (for further discussion of the role of image features in saccade asymmetries, see Foulsham & Kingstone, 2010).

The saccades made in normal viewing also showed biases in direction and amplitude. There was a marked tendency for making horizontal saccades, rather than vertical or oblique eye movements. Most saccades were between about 2° and 7° in amplitude, indicating that they tended to target regions in the parafovea or extending into the periphery, but horizontal saccades were longer than vertical saccades, on average. Why were there more (and longer) horizontal saccades? The bias in the present study was probably exacerbated by the fact that images (and the visible monitor) were landscape in orientation, that scanning started in the centre, and that the image was always presented in the same egocentric reference frame (meaning that the horizontal position of the eyes and biases in the movement of the extraocular muscles may have had an effect). However, we have shown previously that a horizontal bias persists even in the absence of these cues (Foulsham et al., 2008). With random start locations and square images that were rotated from their canonical orientation, that study demonstrated that the horizontal bias was scene centred, rather than egocentric.

Gaze-contingent windows had some general effects on scanning, some of which have been reported elsewhere. First, the saccades made in these conditions were shorter, on average, than those in normal viewing, confirming the influence of peripheral information on saccade guidance and suggesting that a more conservative strategy was employed that targeted features within the window. This would also explain why the gaze-contingent conditions elicited somewhat less dispersed endpoint distributions and a greater central bias: The window curtailed long saccades, so that fixations remained closer to the centre for longer. Removal of peripheral information also had a detrimental effect on memory for the scenes: It took longer to recognise scenes that had been viewed through a gaze-contingent window, consistent with a detriment in encoding in these conditions (see Saida & Ikeda, 1979).

There was mixed evidence for an effect of peripheral masking on the number or duration of fixations. In Experiment 1, some of the window conditions resulted in more fixations, with a slightly lower average duration, than did those in normal viewing, but this was not found in Experiment 2. van Diepen and d'Ydewalle (2003) also found an increased number of fixations with peripheral masking of line drawings of scenes, although this study reported an increase in fixation durations under these conditions. Loschky and McConkie (2002) also reported longer fixation durations in viewing with a low-pass-filtered periphery, which we did not find here, perhaps because we used larger windows. This discrepancy might also occur because, in our study, viewing was limited by a fixed trial duration. The increased difficulty of the gaze-contingent encoding in Experiment 1 was reflected in an increase in the number of fixations, perhaps because each object or region of interest had to be fixated multiple times. This interpretation is consistent with a sequential model of attention in scene perception where fixation durations reflect processing close to fixation and are relatively unaffected by peripheral information. On the other hand, having low-resolution information in the periphery (Experiment 2) was sufficient for eliciting fixations that were not significantly more frequent or longer than in normal viewing.

The effect of window shape

In the introduction, we offered two possible hypotheses for how horizontal and vertical windows would change patterns in scanning direction. First, if saccades were guided in order to maximize the information gained on each fixation (defined as revealing new areas of the scene), a horizontal window would lead to more vertical saccades in order to avoid previously seen regions, with the opposite being true in the case of a vertical window. Second, if saccades were guided towards features that were currently visible, a horizontal window would lead to more horizontal saccades. Our findings point unambiguously to the latter explanation: A vertical window produced more vertical saccades than did the other conditions, even though there were fewer unseen areas to be explored by moving up and down. This difference between window conditions was found even on the very first saccade. The pattern of saccade amplitudes was also systematically related to the dimensions of the window: Saccades with an amplitude and direction matching the boundary of the window were made frequently. We can be confident that these findings are not artefacts of windows with straight edges and a completely masked periphery, because the findings were replicated in Experiment 2 with elliptical apertures and a blurred periphery. In addition, there was no evidence for a trade-off in terms of the horizontal bias and a tendency to fixate at the top and bottom of the display, as was found by Najemnik and Geisler (2008). In fact, the top and bottom of the scene were relatively neglected in all conditions.
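The geometric prediction of the first hypothesis can be made concrete with a small worked example. The sketch below assumes a rectangular window that moves with the eye and treats "information gained" simply as the area of the new window position that was not covered by the previous one. The function name and the window dimensions are hypothetical, chosen for illustration, and are not the values used in the experiments.

```python
def newly_revealed_area(width, height, dx, dy):
    """Area of the shifted window that the previous window did not cover.

    width, height: window size; dx, dy: saccade displacement (same units).
    """
    overlap_w = max(0.0, width - abs(dx))
    overlap_h = max(0.0, height - abs(dy))
    return width * height - overlap_w * overlap_h


# Hypothetical horizontal window in degrees of visual angle (assumed values).
W, H = 12.0, 4.0
amplitude = 3.0

horizontal_gain = newly_revealed_area(W, H, amplitude, 0.0)
vertical_gain = newly_revealed_area(W, H, 0.0, amplitude)

print(horizontal_gain, vertical_gain)
```

Under these assumptions, a saccade of a given amplitude along the short axis of the window uncovers more new area than one along the long axis, which is why an information-maximizing strategy would predict more vertical saccades with a horizontal window. The observed pattern was the opposite.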

Several conclusions can be drawn on the basis of the saccade direction and amplitude results. First, because a bias for horizontal saccades persisted even with a square window (where visible information was equal in all directions), this bias must be partly driven by experience or knowledge about landscape-oriented images and monitors. Second, this default bias for horizontal eye movements was modified by an asymmetric window, consistent with a strategy of targeting points that could already be seen within the high-resolution window. In the vertical window condition, there were more features above and below fixation, and fewer to the left or right, and so the higher frequency of vertical saccades might reflect a tendency for people to move towards the information within the window.

A potential problem with this interpretation is that there was a significant number of saccades that were large enough to be targeted outside the window. How were these saccades controlled, and how did masking and blurring affect their occurrence? Loschky and McConkie (2002) also found that the radius of a gaze-contingent window shortened saccades, and they interpreted this pattern of results as evidence that peripheral filtering reduces the saliency of points outside the window, making them less likely to win the competition for the next saccade. Surprisingly, in our study, saccades were no more likely to move outside the window when the periphery was blurred than when it was completely masked. This further supports the argument that, when given the choice between high-resolution information and masked or above-threshold filtered information, the eye movement system tends to saccade within the window. Furthermore, even long saccades outside the window tended to go in the direction of the elongated boundary. One possibility is that these saccades target features on or near the window boundary but overshoot this destination, due to noise in the saccadic system. This would predict a distribution of amplitudes with the mode at the radius of the boundary, which is similar to what we find. It is also possible that participants were driven to “follow” partially seen objects or details which extended into the masked space, therefore allowing them to make a reasonable prediction about what was there before they planned their saccade.
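The overshoot account can be illustrated with a toy simulation. This is only a sketch under strong assumptions: every saccade is aimed at the window boundary, and landing error is modelled as Gaussian noise on amplitude. The boundary radius, noise level, and function names are hypothetical and do not come from the experiments.

```python
import random
from collections import Counter


def simulate_amplitudes(boundary_radius, noise_sd, n, seed=0):
    """Amplitudes of saccades aimed at the boundary, with Gaussian landing noise."""
    rng = random.Random(seed)
    return [abs(rng.gauss(boundary_radius, noise_sd)) for _ in range(n)]


def modal_bin(amplitudes, bin_width=1.0):
    """Return the (low, high) edges of the most frequent amplitude bin."""
    counts = Counter(int(a // bin_width) for a in amplitudes)
    best_bin, _ = counts.most_common(1)[0]
    return best_bin * bin_width, (best_bin + 1) * bin_width


# Assumed values: boundary 6 deg from fixation, 1.5 deg of amplitude noise.
RADIUS = 6.0
amplitudes = simulate_amplitudes(boundary_radius=RADIUS, noise_sd=1.5, n=5000)

low, high = modal_bin(amplitudes)
outside_share = sum(a > RADIUS for a in amplitudes) / len(amplitudes)

print(f"modal bin: {low}-{high} deg, share landing outside: {outside_share:.2f}")
```

In this toy model the most frequent amplitudes sit at the boundary radius, and a substantial share of saccades land outside the window even though none was aimed there. A more realistic model would mix boundary-directed saccades with shorter within-window saccades, so the simulated share outside the window should not be read as a prediction for the data.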

Another possibility is suggested by the properties of the preceding and following fixations and of the following saccade. It has been suggested that eye movement events in scene viewing can be divided into local clusters of exploratory fixations of long duration with short saccades, separated by larger amplitude, “global” shifts to a new region (Unema et al., 2005). We found that saccades outside the window were followed by shorter fixations and smaller amplitude saccades than were saccades within the window. This suggests that saccades outside the window may have been qualitatively different, global shifts which moved the eyes from one period of local scanning (within the window) to another. Why, then, was the subsequent fixation atypically brief? It is likely that, because the information at this point was degraded at the start of the saccade, its positioning was suboptimal and, therefore, participants terminated the fixation early and made a small re-adjustive saccade. It may be that viewing with a gaze-contingent window exaggerates the local/global viewing strategy, and this would be worth exploring in further research.

Implications for natural scene viewing

An additional point of interest concerns the relationship between the gaze-contingent conditions and normal viewing. Across several measures, a horizontal window led to less of a difference from normal scanning than did a vertical or square window. Specifically, in Experiment 1, a horizontal window had less of an impact on mean saccade amplitude and on the direction and amplitude distributions than did a vertical or square window. This was also true in Experiment 2, and it suggests that viewing with a horizontal window was less disruptive and more normal for participants. The implication here is that during normal vision, the visible region most important for eye guidance is elongated in the horizontal direction. This is consistent with the visibility maps measured by Najemnik and Geisler (2005), albeit in a rather different task.

Although the results emphasise the importance of currently visible features in our task, we should be cautious about making further claims about human eye movement strategies in natural viewing. Window shape is only one of several factors in determining the pattern of saccade directions, and we also confirmed that scene type makes a difference. More horizontal saccades were made when landscapes were viewed than when interiors were viewed, probably because interesting features were arranged along the horizontal, and this did not interact with window shape. The lack of information-seeking saccades may have occurred because participants did not have experience and knowledge about the visibility of the artificial windows and, so, resorted to a more feature-driven approach. It is highly likely that saccade targeting is based on multiple sources of information that can be weighted differently (see Brouwer & Knill, 2007, for an example of this cue integration in visually guided reaching). Therefore, the challenge for modelling is to combine the drive towards currently visible features and the desire to maximize information in a way that explains the present data. For example, perhaps locations close to the boundary of the window (which were frequently fixated in the present study) represent a trade-off between targeting visible features and revealing new information. This is another interesting avenue for future study.

In conclusion, the experiments reported here point to two important principles regarding the control of saccades in scene viewing. First, rather than aiming only to maximize the information on each saccade, the eye movement system in our encoding task targeted features within the regions of foveal or parafoveal visibility. Second, this region is elongated horizontally, which may be an important factor in the horizontal saccade bias that has been observed in natural scene perception. With this foundation in place, the opportunity for future investigations is vast. In addition to questions of local/global exploration and the modelling of saccade targeting, researchers can use the gaze-contingent window to examine the role of bottom-up and top-down features in search, scanning in different tasks, and how changes in window shape might be used to enhance exploration in patient populations, such as those with left-side neglect.

Notes

Acknowledgement

This work was supported by NSERC grants to A.K. and a Commonwealth Postdoctoral fellowship to T.F. from the Government of Canada.

References

  1. Brouwer, A.-M., & Knill, D. C. (2007). The role of memory in visually guided reaching. Journal of Vision, 7, 1–12.
  2. Buswell, G. T. (1935). How people look at pictures: A study of the psychology of perception in art. Chicago: University of Chicago Press.
  3. Chen, X., & Zelinsky, G. J. (2006). Real-world visual search is dominated by top-down guidance. Vision Research, 46, 4118–4133.
  4. Dickinson, C. A., & Intraub, H. (2009). Spatial asymmetries in viewing and remembering scenes: Consequences of an attentional bias? Attention, Perception, & Psychophysics, 71, 1251–1262.
  5. Einhauser, W., Rutishauser, U., & Koch, C. (2008). Task-demands can immediately reverse the effects of sensory-driven saliency in complex visual stimuli. Journal of Vision, 8(2, Art. 2), 1–19.
  6. Foulsham, T., & Kingstone, A. (2010). Asymmetries in the direction of saccades during perception of scenes and fractals: Effects of image type and image features. Vision Research, 50, 779–795.
  7. Foulsham, T., Kingstone, A., & Underwood, G. (2008). Turning the world around: Patterns in saccade direction vary with picture orientation. Vision Research, 48, 1777–1790.
  8. Foulsham, T., & Underwood, G. (2007). How does the purpose of inspection influence the potency of visual saliency in scene perception? Perception, 36, 1123–1138.
  9. Foulsham, T., & Underwood, G. (2008). What can saliency models predict about eye movements? Spatial and sequential aspects of fixations during encoding and recognition. Journal of Vision, 8(6), 1–17.
  10. Geisler, W. S., Perry, J. S., & Najemnik, J. (2006). Visual search: The role of peripheral information measured using gaze-contingent displays. Journal of Vision, 6(9), 858–873.
  11. Greene, H. H. (2006). The control of fixation duration in visual search. Perception, 35, 303–315.
  12. Itti, L., & Koch, C. (2000). A saliency-based search mechanism for overt and covert shifts of visual attention. Vision Research, 40, 1489–1506.
  13. Kanan, C. M., Tong, M. H., Zhang, L., & Cottrell, G. W. (2009). SUN: Top-down saliency using natural statistics. Visual Cognition, 17(6 & 7), 979–1003.
  14. Larson, A. M., & Loschky, L. C. (2009). The contributions of central versus peripheral vision to scene gist recognition. Journal of Vision, 9(10, Art. 6), 1–16.
  15. Loschky, L. C., & McConkie, G. W. (2002). Investigating spatial vision and dynamic attentional selection using a gaze-contingent multiresolutional display. Journal of Experimental Psychology: Applied, 8, 99–117.
  16. Loschky, L. C., McConkie, G. W., Yang, H., & Miller, M. E. (2005). The limits of visual resolution in natural scene viewing. Visual Cognition, 12, 1057–1092.
  17. Loschky, L. C., & Wolverton, G. S. (2007). How late can you update gaze-contingent multiresolutional displays without detection? ACM Transactions on Multimedia Computing, Communications, and Applications, 3(4), 1–10.
  18. McConkie, G. W., & Rayner, K. (1975). Span of effective stimulus during a fixation in reading. Perception & Psychophysics, 17, 578–586.
  19. Najemnik, J., & Geisler, W. S. (2005). Optimal eye movement strategies in visual search. Nature, 434, 387–391.
  20. Najemnik, J., & Geisler, W. S. (2008). Eye movement statistics in humans are consistent with an optimal search strategy. Journal of Vision, 8(3, Art. 4), 1–14.
  21. Navalpakkam, V., & Itti, L. (2005). Modeling the influence of task on attention. Vision Research, 45, 205–231.
  22. Parkhurst, D., Law, K., & Niebur, E. (2002). Modeling the role of salience in the allocation of overt visual attention. Vision Research, 42, 107–123.
  23. Peters, R. J., Iyer, A., Itti, L., & Koch, C. (2005). Components of bottom-up gaze allocation in natural images. Vision Research, 45, 2397–2416.
  24. Pomplun, M., Reingold, E. M., & Shen, J. Y. (2001). Peripheral and parafoveal cueing and masking effects on saccadic selectivity in a gaze-contingent window paradigm. Vision Research, 41, 2757–2769.
  25. Rao, R. P. N., Zelinsky, G. J., Hayhoe, M. M., & Ballard, D. H. (2002). Eye movements in iconic visual search. Vision Research, 42, 1447–1463.
  26. Rayner, K. (1998). Eye movements in reading and information processing: 20 years of research. Psychological Bulletin, 124, 372–422.
  27. Rayner, K. (2009). Eye movements and attention in reading, scene perception, and visual search. Quarterly Journal of Experimental Psychology, 62, 1457–1506.
  28. Reinagel, P., & Zador, A. M. (1999). Natural scene statistics at the centre of gaze. Network: Computation in Neural Systems, 10, 341–350.
  29. Reingold, E. M., Loschky, L. C., McConkie, G. W., & Stampe, D. M. (2003). Gaze-contingent multiresolutional displays: An integrative review. Human Factors, 45, 307–328.
  30. Saida, S., & Ikeda, M. (1979). Useful visual-field size for pattern perception. Perception & Psychophysics, 25, 119–125.
  31. Shioiri, S., & Ikeda, M. (1989). Useful resolution for picture perception as a function of eccentricity. Perception, 18, 347–361.
  32. Tatler, B. W. (2007). The central fixation bias in scene viewing: Selecting an optimal viewing position independently of motor biases and image feature distributions. Journal of Vision, 7(14, Art. 4), 1–17.
  33. Tatler, B. W. (2009). Eye guidance in natural scenes [Special issue]. Visual Cognition, 17(6/7).
  34. Tatler, B. W., & Vincent, B. T. (2009). The prominence of behavioural biases in eye guidance. Visual Cognition, 17, 1029–1054.
  35. Torralba, A., Oliva, A., Castelhano, M. S., & Henderson, J. M. (2006). Contextual guidance of eye movements and attention in real-world scenes: The role of global features in object search. Psychological Review, 113, 766–786.
  36. Tseng, P. H., Carmi, R., Cameron, I. G. M., Munoz, D. P., & Itti, L. (2009). Quantifying center bias of observers in free viewing of dynamic natural scenes. Journal of Vision, 9(7, Art. 4), 1–16.
  37. Underwood, G., Jebbett, L., & Roberts, K. (2004). Inspecting pictures for information to verify a sentence: Eye movements in general encoding and in focused search. Quarterly Journal of Experimental Psychology, 57A, 165–182.
  38. Unema, P. J. A., Pannasch, S., Joos, M., & Velichkovsky, B. M. (2005). Time course of information processing during scene perception: The relationship between saccade amplitude and fixation duration. Visual Cognition, 12, 473–494.
  39. van Diepen, P. M. J., & d'Ydewalle, G. (2003). Early peripheral and foveal processing in fixations during scene perception. Visual Cognition, 10, 79–100.
  40. Yarbus, A. L. (1967). Eye movements and vision. New York: Plenum.
  41. Zelinsky, G. J. (2008). A theory of eye movements during target acquisition. Psychological Review, 115, 787–835.

Copyright information

© Psychonomic Society, Inc. 2010

Authors and Affiliations

  1. Department of Psychology, University of British Columbia, Vancouver, Canada
