Attention, Perception, & Psychophysics, Volume 78, Issue 2, pp 602–617

On the optimal viewing position for object processing

  • Lotje van der Linden
  • Françoise Vitu


Numerous studies have shown that a visually presented word is processed most easily when participants initially fixate just to the left of the word's center. Fixating on this optimal viewing position (OVP) results in shorter response times and a lower probability of making additional within-word refixations (OVP effects), but also longer initial-fixation durations (an inverted-OVP or I-OVP effect), as compared to initially fixating at the beginning or the end of the word. Thus, typical curves are u-shaped (or inverted-u-shaped), with a leftward bias. Most researchers explain the u-shape in terms of visual constraints, and the leftward bias in terms of language constraints. Previous studies have demonstrated that (I-)OVP effects are not specific to words, but generalize to object viewing. We further investigated this by comparing the strength and (a)symmetry of (I-)OVP effects for words and objects. To this end, we gave participants an object- versus word-naming task in which we manipulated the position at which they initially fixated the stimulus (i.e., a line drawing or the written name of an object). Our results showed that object viewing, just like word viewing, resulted in u-shaped (I-)OVP curves. However, the effect was weaker than for words. Furthermore, for words, the curves were biased to the left, whereas they were symmetrical for objects. This might indicate that part of the (I-)OVP effect for words is language-specific, and that (I-)OVP effects for objects are a purer measure of the effect of visual constraints.


Keywords: Eye movements; Optimal viewing position; Object identification; Refixations; Word identification; Fixation location; Fixation duration


During reading, the eyes tend to land at the center of short words, and slightly to the left of the center of long words, at least for languages that are read from left to right (Rayner, 1979). Interestingly, this so-called preferred-viewing location (PVL) generalizes to visual tasks other than reading. For example, Henderson (1993) and Foulsham and Underwood (2009) showed that the eyes also preferentially land at the center of isolated objects. More recently, Nuthmann and Henderson (2010) and Foulsham and Kingstone (2013) showed similar PVL effects when objects were embedded in complex, naturalistic visual scenes. Thus, when plotting the distributions of saccadic landing positions on words or objects, on average, the peak is near the center of the stimuli. As further developed below, previous studies have shown that initial-fixation positions on isolated words, as well as on words in sentences, influence subsequent task performance and eye movements. The same is true for isolated objects, and, to some extent, for objects in scenes.

The purpose of the current study was to directly compare these initial-fixation effects between isolated words and objects. We did this by experimentally manipulating where on these stimuli the participant initially fixated. The systematic comparison between objects and words aimed at further understanding the mechanisms underlying initial-fixation effects, and determining to what extent they are specific to language processing.

The optimal viewing position for word processing

The ease with which an isolated written word is processed depends on where the eyes initially fixate. In languages that are read from left to right, words are processed most efficiently when participants initially fixate at the center of the word, or just to the left of it (O’Regan, Lévy-Schoen, Pynte, & Brugaillère, 1984; for reviews see Brysbaert & Nazir, 2005; Rayner, 1998; Vitu, 2011). Such an optimal viewing position (OVP) results in faster (O’Regan & Jacobs, 1992; O’Regan et al., 1984) and more accurate (Brysbaert, Vitu, & Schroyens, 1996; Nazir, O’Regan, & Jacobs, 1991) word identification as compared to when the eyes initially fixate either extreme end of the word. Thus, plotting response times as a function of initial fixation position reveals an asymmetric u-curve that is biased to the left of the word's center. The proportion of correct responses shows the reverse pattern (i.e., an inverted u-shape). We refer to these as the response-time-OVP (RT-OVP) effect, and the accuracy-OVP effect, respectively.

Interestingly, initial fixation position also affects eye movements. When participants initially fixate a word's center, they make fewer within-word refixations than when they initially fixate one of the word's ends. Thus, when plotting refixation probability as a function of initial-fixation position, this again results in an asymmetric u-shaped curve with a slight bias to the left of the word's center. This refixation-OVP effect was initially demonstrated in an isolated word-recognition task (O’Regan & Lévy-Schoen, 1987). Many later studies demonstrated that it generalizes to natural reading (McConkie, Kerr, Reddix, Zola, & Jacobs, 1989; Nuthmann, Engbert, & Kliegl, 2005; Rayner, Sereno, & Raney, 1996), though in a weaker manner (Vitu, O’Regan, & Mittau, 1990).

Fixation durations demonstrate a reverse pattern. When participants initially fixate on the optimal position, the duration of the initial fixation is longer than when participants initially fixate the extreme ends of a word. This is the case regardless of whether the initial fixation is followed by a within-word refixation or not, and results in an inverted u-curve, again with a slight bias to the left of the word's center. We refer to this as the fixation-duration inverted OVP (I-OVP) effect. Just like the refixation-OVP effect, the fixation-duration I-OVP effect is observed in isolated-word paradigms (Vitu, Lancelin, & Marrier d’Unienville, 2007; see also O’Regan & Lévy-Schoen, 1987) as well as during natural reading (Hyönä & Bertram, 2011; Nuthmann et al., 2005; Vitu, McConkie, Kerr, & O’Regan, 2001).

Underlying mechanisms

Although RT-OVP, accuracy-OVP, refixation-OVP, and fixation-duration I-OVP effects are robust phenomena, researchers debate the mechanisms that underlie the u- (or inverted-u-)shape and the leftward asymmetry of (I-)OVP curves. The asymmetry can refer to the curve's optimum being shifted to the left of the word’s center, or to the curve being lower (or higher, for the I-OVP effect) at the left part of the word. In either case, initial-viewing position effects are stronger to the right than to the left of a word's center.

Visual constraints

Most researchers agree that at least part of the u-shape of the accuracy-OVP curve is explained by visual constraints (Nazir, Jacobs, & O’Regan, 1998; Nazir et al., 1991; see also McConkie et al., 1989). Letter visibility is limited by the rapid drop of visual acuity from the center of the fovea (Levi, Klein, & Aitsebaomo, 1985; Nazir, Heller, & Sussmann, 1992). At an eccentricity of 1° (typically corresponding to three to four letters), visual acuity is already reduced by 60 % (Wertheim, 1894, as cited by Brysbaert & Nazir, 2005). Thus, central fixations benefit maximally from high-acuity foveal vision. Fixating the first letter of a word is suboptimal, because part of the high-acuity foveal vision is wasted on the white space preceding the word, whereas the last part of the word falls outside of foveal vision. The same reasoning holds for fixations on the last letter.

Furthermore, letters are more difficult to recognize when they are surrounded by other letters (as is the case in written words) as compared to when they are presented in isolation. This effect, which is even stronger when letters are presented further in the periphery (Bouma, 1970), is referred to as crowding (Pelli, 2008) or lateral masking (Bouma, 1970).
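
The acuity argument above can be made concrete with a toy calculation. The sketch below is illustrative only and is not a model used in the studies cited: it assumes that letter visibility falls off linearly with distance from the fixated letter, and the hypothetical `drop_per_letter` value of 0.15 is chosen so that visibility is reduced by 60 % four letters away, loosely in line with the figure quoted above. Crowding is ignored.

```python
def mean_visibility(word_length, fixated_letter, drop_per_letter=0.15):
    """Average letter visibility when the eyes fixate a given letter (0-based).

    Assumption: visibility is 1.0 at fixation and decreases linearly by
    `drop_per_letter` for every letter of eccentricity, never below 0.
    """
    total = 0.0
    for position in range(word_length):
        eccentricity = abs(position - fixated_letter)
        total += max(0.0, 1.0 - drop_per_letter * eccentricity)
    return total / word_length


# Mean visibility for every possible fixated letter in a 7-letter word.
curve = [mean_visibility(7, letter) for letter in range(7)]
for letter, value in enumerate(curve):
    print(f"fixating letter {letter + 1}: mean visibility {value:.3f}")
```

Under these assumptions the curve peaks at the middle letter and is symmetric around it, which is why visual constraints alone do not predict a leftward bias.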

Language constraints

If only these low-level visual factors played a role, the resulting curves would be symmetrical. Yet, they are not. As reviewed above, they show a slight bias to the left. Several explanations have been proposed for this leftward bias.

Firstly, Brysbaert (1994, 2004) argued that part of the leftward bias can be explained by cerebral dominance. Information presented in the right visual field is directly projected to the left hemisphere, which, for the majority of people, is specialized in language processing. In contrast, visual information presented in the left visual field requires inter-hemispheric transfer to reach the dominant left hemisphere. This results in a right-visual-field advantage for word processing (Bryden, 1982; Hellige, 1990). Therefore, word recognition is easier when a larger part of the word falls to the right of fixation, that is, when the eyes fixate just to the left of the word's center (Brysbaert, 1994; Brysbaert et al., 1996).

Following this reasoning, a different pattern should be observed for the minority of readers for whom the right hemisphere is dominant for language processing. This is indeed the case. Van der Haegen and colleagues (2013) showed that right-hemisphere-dominant readers were faster to read aloud words when initially fixating towards the right as compared to the left of the word's center. The left-hemisphere-dominant control group showed the classical pattern: faster responses when initially fixating to the left of the center (see also Brysbaert, Cai, & van der Haegen, 2012).

Secondly, the leftward bias might be a consequence of reading direction (Farid & Grainger, 1996; Nazir, 2000; Nazir, Ben-Boutayab, Decoppet, Deutsch, & Frost, 2004). Western languages are read from left to right. In line with this, the region from which useful letter information is obtained during fixations on letter strings is asymmetric: To the left of fixation, only a few (about four) letters are used, whereas to the right of fixation up to 10–15 letters are used (McConkie & Rayner, 1976; Rayner, Well, & Pollatsek, 1980; Underwood & McConkie, 1985). This asymmetry in perceptual span is typically interpreted as a consequence of attention being biased towards the direction of reading. In contrast, Nazir and colleagues (2000, 2004) argued that, because of reading habits, words are recognized more frequently in the right than in the left visual field. This, in turn, has trained the visual system to adopt a bias in favor of letters to the right of fixation. Regardless of the underlying mechanism, reading direction is thought to contribute to the asymmetry of accuracy-OVP curves. Partial evidence for this comes from studies with right-to-left readers as participants. For example, Farid and Grainger (1996) showed that in Arabic, the OVP is at the word's center, with no leftward bias.

A final explanation is provided by Clark and O’Regan’s (1999) lexical-ambiguity hypothesis, also referred to as the information-distribution hypothesis (see also Stevens & Grainger, 2003). These authors modeled OVP effects by combining a simplified estimate of letter visibility with lexical ambiguity. To obtain their simplified measure of letter visibility, the authors reasoned that due to visual acuity and crowding, the four (most) visible letters in a word are the two letters that are closest to fixation plus the two outer letters. (The two outer letters suffer less from crowding because they are not surrounded by other letters on both sides.) Next, as an estimate of word ambiguity, they calculated how many words in the dictionary are compatible with these four letters. When plotting mean ambiguity as a function of fixation position, the curves resembled the classical OVP curves. This suggests that fixating just to the left of the word's center leads to the smallest lexical ambiguity. This is in line with the fact that, on average, the beginning of a word contains more information (i.e., is more unique) than the end of a word (Holmes & O’Regan, 1987; O’Regan et al., 1984).
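
The ambiguity count described above can be sketched in a few lines. This is a minimal illustration and not Clark and O'Regan's implementation: it assumes that fixation falls in the gap between two adjacent letters (so that "the two letters closest to fixation" is well defined), it only compares words of equal length, and `TOY_LEXICON` is a hypothetical six-word list standing in for a real dictionary.

```python
# Hypothetical toy lexicon; a real analysis would use a full dictionary.
TOY_LEXICON = ["plant", "plait", "plaid", "paint", "point", "print"]


def visible_positions(length, gap):
    """Indices of the letters assumed visible: both outer letters plus the
    two letters flanking the fixated gap (gap g lies between letters g, g+1)."""
    return sorted({0, length - 1, gap, gap + 1})


def ambiguity(word, gap, lexicon):
    """Number of same-length lexicon entries compatible with the visible letters."""
    positions = visible_positions(len(word), gap)
    return sum(
        1
        for candidate in lexicon
        if len(candidate) == len(word)
        and all(candidate[p] == word[p] for p in positions)
    )


# Ambiguity of "plant" for each fixated gap, from left to right.
profile = [ambiguity("plant", gap, TOY_LEXICON) for gap in range(len("plant") - 1)]
print(profile)
```

Averaging such profiles over all words of a given length would give the mean-ambiguity curve that the authors compared with empirical OVP curves. The toy lexicon is far too small to say anything about the shape of that curve.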

Interestingly, Clark and O'Regan's lexical-ambiguity hypothesis not only explains the leftward bias, but also the u-curve itself. Even though their model contains some minimal visual constraints, the obtained u-curves are much steeper than would be expected on the basis of letter visibility only. Therefore, the authors do not attribute the OVP effect to visual constraints only. Instead they argue that the lexical information provided by the visible letters causes (most of) the effect.

In sum, the above described hypotheses are all related to language. They are not mutually exclusive. The leftward bias is probably the result of the interplay between cerebral dominance, reading habits and lexical constraints (for a similar conclusion, see Brysbaert, 2004; Brysbaert & Nazir, 2005; Whitney, 2001). Lexical ambiguity not only contributes to the asymmetry but also to the u-shape of the OVP effect.

Oculomotor strategies

As reviewed above, both visual and language constraints determine which position is optimal for viewing a word. It is not surprising that fixating on this position, as compared to other positions, reduces the need to refixate a word. Yet, the inverted OVP (I-OVP) effect on initial-fixation duration deserves some explanation because it might appear contradictory under the classical hypothesis that fixation duration is an index of processing difficulty. (For example, fixation durations on low-frequency words are longer than fixation durations on high-frequency words; Inhoff & Rayner, 1986.)

To explain the fixation-duration I-OVP effect, Vitu and colleagues (2001, 2007) proposed a perceptual-economy account. If observers' eyes are at a position that they have previously experienced as being optimal for word identification, they tend to keep their eyes at this optimal position for a longer time. Since word identification occurs in parallel with perceptual processing, this, in turn, decreases the chance that participants will make a refixation before having identified the word. The latter explains the refixation-OVP curve.

Alternatively, Nuthmann and colleagues (2005, 2007) proposed that the fixation-duration I-OVP effect results from the fact that sometimes the eyes under- or overshoot the intended word while moving along the lines of text. To correct for such mislocated fixations, which tend to be towards the end of the preceding or the beginning of the next word, the eyes rapidly move away from this unintended position, resulting in very brief initial-fixation durations. This mechanism may account for fixation-duration I-OVP curves in normal reading, where mislocations are possible, but not for I-OVP curves in isolated-word paradigms, where initial-fixation position is experimentally imposed (O’Regan & Lévy-Schoen, 1987). In case of the latter, Nuthmann and colleagues (2007) suggest that I-OVP curves may be due to a more detailed error-correction mechanism (not at work during natural reading) that responds to small deviations of the eyes relative to the word's center. Thus, even though I-OVP curves are found in both continuous reading and isolated-word reading, their underlying mechanism might be different.

The optimal viewing position for object viewing

As described above, most researchers agree that the OVP effect for word processing has both a visual and a language component. This raises the question of whether the OVP effect also holds for objects. The decrease of visual acuity with eccentricity is due to the distribution of photoreceptors on the retina. Thus, its influence should not be specific to words, but should transfer to other visual stimuli such as objects. The same is likely true for crowding. As suggested by Pelli and Tillman (2008), crowding is not specific to letter strings. Object viewing also suffers from crowding (Wallace & Tjan, 2011). Most interestingly for the current study, even the different features that together constitute a single object may crowd each other. For example, during face viewing, the visibility of the nose suffers from crowding effects from the flanking eyes (Martelli, Majaj, & Pelli, 2005). The same might hold for features of other objects (reviewed in Pelli & Tillman, 2008, Fig. 5).

Thus, for objects, just as for words, initial-fixation position should affect performance and gaze behavior, leading to u-shaped (or inverted-u-shaped) (I-)OVP curves. Several previous studies tested this idea (Foulsham & Kingstone, 2013, Experiment 1; Foulsham & Underwood, 2009; Henderson, 1993). For example, Henderson (1993) recorded eye movements while participants viewed an array of black-and-white line drawings in preparation for a memory task, and found a refixation-OVP effect and a fixation-duration I-OVP effect. This was a first indication of (I-)OVP effects on object processing, analogous to (I-)OVP effects in word processing.

More recently, Foulsham and Kingstone (2013) presented participants with isolated photographs of objects that were cut from images of natural scenes (Experiment 1). Participants had to make a speeded button press to distinguish real objects from non-objects (formed by scrambled pieces of real objects). The researchers manipulated the position at which participants initially fixated the objects. As predicted, performance was influenced by this manipulation. When participants fixated the center of the object, they were faster to identify the object as compared to when their eyes were fixating the object's boundary (an RT-OVP effect). Furthermore, as in Henderson's (1993) study, eye movements showed a refixation-OVP effect and a fixation-duration I-OVP effect. The authors interpreted this as evidence for the existence of an optimal viewing position for object viewing, reflecting visual constraints.

Whether refixation-OVP effects and fixation-duration I-OVP effects generalize to objects embedded in natural scenes remains debated. Pajak and Nuthmann (2013) found an I-OVP effect for large objects (3.15–6°), but not for medium (1.85–3.15°) and small (1–1.85°) objects. Foulsham and Kingstone (2013, Experiment 2) did not observe an I-OVP effect. However, they did not investigate the effect of stimulus size. Therefore, it might be possible that large objects in their study did yield an I-OVP effect, whereas small objects did not. This may have prevented the overall effect from reaching significance. Regarding the refixation-OVP effect, the two studies provided conflicting results: Pajak and Nuthmann did not find an effect of initial fixation location on refixation probability, while Foulsham and Kingstone observed an inverted u-curve, thus suggesting a trend opposite to the typical refixation-OVP effect. We will briefly come back to the issue of (I-)OVP effects for objects in natural scenes in the Discussion.

The current paper focuses on isolated words and objects. As mentioned above, RT-OVP, refixation-OVP, and fixation-duration I-OVP effects generalize from isolated words to isolated objects. Yet it remains undetermined whether the effects are quantitatively comparable for objects and words. Word and object (I-)OVP curves might differ in strength and asymmetry. With regard to the strength of the effects, Clark and O'Regan (1999) proposed that part of the (inverted) u-shape of (I-)OVP curves in words can be explained by lexical ambiguity only. The effect of lexical ambiguity is not directly applicable to object viewing. At least, there is no reason to suppose that diagnostic information in objects is distributed in a systematic way, as is the case for words.1 Thus, the effect of word ambiguity (present for words, absent for objects), might make the effect of initial-fixation position weaker for objects than for words.

Regarding the asymmetry of the (I-)OVP curves, previous studies on (I-)OVP effects in isolated objects and objects in scenes did not report a leftward bias, although none of these studies appear to have explicitly tested the (a)symmetry of the curves. On the one hand, the leftward bias might be specific to word processing, and hence be absent in objects. Indeed, Clark and O'Regan's lexical-ambiguity hypothesis explains the leftward bias by the fact that, on average, words contain more information at the word's beginning than at the word's end. Again, there is no reason to assume that this is the case for objects as well, at least not when the orientation of the objects (e.g., the position of the head of animals, etc.) is not manipulated. Similarly, the effects of reading habits and hemispheric dominance are so intrinsically related to language that they might only apply to words, and not to objects.

On the other hand, some indirect evidence suggests that the leftward bias might transfer to object viewing. Firstly, Brysbaert's (1994) hemispheric-dominance hypothesis is partly based on findings from visual half-field studies (showing a right-visual-field advantage of word processing), which he stretches to OVP studies (showing a right-side advantage for word processing, see Brysbaert, 1994; Brysbaert et al., 1996). Several studies have also demonstrated a right-visual-field advantage for object naming (Hunter & Brysbaert, 2008). Thus, analogously, naming foveally presented objects might be easier when the majority of the stimulus falls into the right visual field (which directly projects to the left hemisphere), as compared to the left visual field (which requires inter-hemispheric transfer). Of course, this would only hold for object-naming tasks (as used in our current study), where language production is heavily involved.

Secondly, Nazir and colleagues (2004) argued that reading habits have shaped the way the visual system processes stimuli in general. For example, reading direction influences how observers scan faces: Left-to-right readers show a leftward bias, whereas right-to-left readers show a rightward bias (Heath, Rouhana, & Abi Ghanem, 2005; Hsiao & Cottrell, 2008; Vaid & Singh, 1989). Thus, due to either hemispheric dominance or reading habits, (I-)OVP effects might be asymmetric for objects, just as they are for words.

The current study

The purpose of the current study was to directly compare the shape and asymmetry of (I-)OVP curves for words and objects, in order to determine whether part of the OVP phenomenon is indeed specific to language. To this end, we gave participants both an object-naming and a word-naming task. Although similar to Foulsham and Kingstone's (2013) Experiment 1, our study complemented this previous work in two ways: Firstly, we compared object viewing with word viewing, and secondly, we used a different task (a verbal naming task instead of a manual decision task).

To manipulate the participant's initial fixation position, we presented words and objects at variable locations relative to a previously displayed fixation dot. To optimize the comparison between both stimulus types, we matched several possibly confounding properties as much as possible. Firstly, we used object-word pairs from a normative stimulus set (Rossion & Pourtois, 2004), such that the required correct verbal response was the same for both types of stimuli (e.g., participants had to respond “chat,” which is French for cat, after seeing the written word /chat/ and after seeing a picture of a cat). Secondly, we controlled the visual properties of both stimulus types as outlined in the section below. For example, we made sure that the width of the word and the object were matched within pairs (e.g., the cat pair). Thus, any potential difference in OVP curves between both stimulus types (e.g., flatter for objects than for words) could not be explained by their overall width (e.g., wider objects than words).

To foreshadow the results, we found u-shaped (I-)OVP curves for both objects and words, although the effects were weaker for objects. Furthermore, (I-)OVP curves for words showed a bias towards the left, whereas the (I-)OVP curves for objects were symmetric.



Method

Participants

Thirty observers participated in the experiment. All were right-handed, had normal or corrected-to-normal vision, were naive as to the purpose of the experiment, and had French as their native language. They received payment (€10 per hour) in return for their participation and gave their written informed consent.


Apparatus

Participants sat in front of a computer screen in a dimly lit room. Stimulus presentation was controlled by OpenSesame (Mathôt, Schreij, & Theeuwes, 2012) on a 21-in CRT monitor with a resolution of 1,024 × 768 pixels and a refresh rate of 100 Hz. The distance between the participant's eyes and the monitor was 75 cm and was kept constant by stabilizing the participant's head with a chin rest. Vocal responses were collected with a mono-channel microphone sampling at 44,100 Hz. Eye-position data of the right eye were recorded with a remote EyeLink 1000 system (SR Research Ltd., Mississauga, Ontario, Canada) with a sampling rate of 1,000 Hz (accuracy: 0.5°; precision: 0.01° RMS). Viewing was binocular.


Stimuli

Our stimulus source was the set of 260 line drawings created by Rossion and Pourtois (2004). Because our goal was to directly compare (I-)OVP effects between words and objects, we took great care in matching words and objects on several properties.

First, we excluded all object-word pairs of which the written name contained fewer than four or more than eight letters. Next, we rotated some of the line drawings, such that their orientation was as horizontal as possible (see Fig. 1a). We did this only when the rotation did not harm the ecological validity of the stimulus (e.g., we rotated fruits, vegetables, and utensils, but not a burning candle). Then, we cropped the selected stimuli on the basis of a just-fitting rectangular bounding box around the actual line drawings. From these cropped line drawings, we only selected the ones with an aspect (width:height) ratio of 1 or higher. By doing so, we excluded line drawings whose height was larger than their width (see Fig. 1b).

Fig. 1. Stimulus preparation and selection (see main text for explanation). Note that, although depicted here in grayscale, the stimuli were colored.

Furthermore, we determined the y coordinate of the part of the stimulus where the contrast with the white background was maximal (see dotted lines in Fig. 1c), in order to vertically align the stimulus accordingly during stimulus presentation. This minimized the chance that there was no visual information at the initial fixation position (see Fig. 2). We excluded line drawings for which the alignment required a vertical shift of more than 25 % of the object's height (see Fig. 1c). (If there were multiple maximum-contrast peaks, we considered the one closest to the vertical object center).
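
The vertical-alignment step can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used: the image is taken to be a grayscale pixel grid (a list of rows), "contrast with the white background" is approximated by summing each row's absolute difference from white (255), and the required shift is measured from the vertical center of the cropped image. The function names are hypothetical.

```python
def alignment_row(image, background=255):
    """Row index with maximal summed contrast against the background.

    If several rows share the maximum, the one closest to the vertical
    center of the image is returned.
    """
    contrast = [sum(abs(background - pixel) for pixel in row) for row in image]
    peak = max(contrast)
    center = (len(image) - 1) / 2
    candidates = [y for y, value in enumerate(contrast) if value == peak]
    return min(candidates, key=lambda y: abs(y - center))


def vertical_shift_ok(image, max_fraction=0.25):
    """True if aligning on the max-contrast row shifts the image by no more
    than `max_fraction` of its height (the exclusion criterion above)."""
    height = len(image)
    center = (height - 1) / 2
    shift = abs(alignment_row(image) - center)
    return shift <= max_fraction * height


# Tiny 5-row example: rows 0 and 3 are fully dark, the rest are white.
example = [[0, 0], [255, 255], [255, 255], [0, 0], [255, 255]]
print(alignment_row(example), vertical_shift_ok(example))
```

For colored stimuli, as used in the study, a per-pixel contrast measure would first have to be chosen; that choice is not specified in the text.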

Fig. 2. The left panel shows the trial sequence for the word-naming and object-naming block, respectively. The right panel shows the five levels of Displacement.

Next, we piloted the remaining object-word pairs on native French-speaking colleagues, and excluded a few line drawings for which the ‘correct response’ (according to Rossion & Pourtois' stimulus set) appeared ambiguous. This resulted in a final selection of 105 object-word pairs.

We created bitmaps of the written words (lower case, Courier New, font size 48, character space = 0.85°) that corresponded to the line drawings, and cropped the resulting bitmaps on the basis of a best-fitting rectangular bounding box around the letter string. Then, we scaled the line drawings such that their width matched the width of the corresponding written-word bitmap (see Fig. 1d). Because we included only 4-, 5-, 6-, 7-, and 8-letter words (and presented them in mono-space font), this resulted in five possible widths (see Table 1). For words, the mean height was 1° (SD = 0.14°, range 0.65–1.18°). For objects, the height was much more variable (M = 2.99°, SD = 1.50°, range 0.32–6.74°). The mean printed frequency (according to the Lexique 3.8 data base, New, Brysbaert, Veronis, & Pallier, 2007; see also New, Pallier, Brysbaert, & Ferrand, 2004; New, Pallier, Ferrand, & Matos, 2001) of the words was 50.25 occurrences per million (SD = 111.27, range 0.54–788.72).

Table 1. Width of the stimulus (identical for words and objects) as a function of the length of the corresponding word

Word length: 4 letters | 5 letters | 6 letters | 7 letters | 8 letters

Data and stimuli, where possible given license restrictions, are available from the first author's website:


Design

We manipulated the position at which participants initially fixated the stimulus. The stimulus was displaced by −50, −25, 0, 25, or 50 % of its width relative to a previous fixation point (see also “Procedure”). In the remainder of the paper, we will use proportions of width, rather than percentages, to describe the displacement relative to the stimulus' center. Thus, a displacement of −0.5 indicates that the eyes initially fixated the extreme left border of the stimulus, 0 indicates that the eyes initially fixated the middle of the stimulus, etc. (see Fig. 2, right panel).
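
As a worked example of this convention, the sketch below computes where the stimulus center must be drawn so that the fixation point falls at a given displacement. The function name and the pixel values are hypothetical; only the five displacement levels and the sign convention come from the text.

```python
# The five displacement levels, as proportions of stimulus width.
DISPLACEMENTS = (-0.5, -0.25, 0.0, 0.25, 0.5)


def stimulus_center_x(fixation_x, stimulus_width, displacement):
    """Horizontal position of the stimulus center such that the fixation
    point lies `displacement` stimulus widths from the center
    (negative = fixation left of center)."""
    return fixation_x - displacement * stimulus_width


# Hypothetical values: fixation dot at x = 512 px, stimulus 200 px wide.
for level in DISPLACEMENTS:
    center = stimulus_center_x(512, 200, level)
    print(f"displacement {level:+.2f}: stimulus centered at x = {center:.0f}")
```

With a displacement of −0.5 the stimulus is centered half a width to the right of the fixation dot, so its left border coincides with the fixated location.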

Furthermore, we manipulated the factor Stimulus Type, such that all participants saw all stimuli once as a letter string (e.g., the written word "crayon," which is French for pencil) and once as an object (i.e., a line drawing of a pencil). This resulted in a 5 × 2 within-subjects design. Displacement was varied within blocks using a Latin-square design, such that each participant saw each word and object only once, but all stimuli appeared at all locations across participants. Stimulus Type was manipulated between blocks (see Fig. 2, left panel).

The experiment consisted of two blocks of 105 trials, corresponding to a word-naming and an object-naming block. The order of the blocks was counterbalanced across participants.
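
One simple way to obtain such a Latin-square assignment is to rotate the displacement levels across participants. The sketch below is an assumption about how this could be done, not a description of the actual experiment script; the rotation rule and the function name are hypothetical, and trial order would still need to be randomized separately.

```python
from collections import Counter

DISPLACEMENTS = (-0.5, -0.25, 0.0, 0.25, 0.5)


def assign_displacements(participant, n_stimuli=105, levels=DISPLACEMENTS):
    """Map each stimulus index to a displacement level for one participant.

    Rotating by participant number means that any five consecutive
    participants see a given stimulus at all five locations.
    """
    return {
        stimulus: levels[(stimulus + participant) % len(levels)]
        for stimulus in range(n_stimuli)
    }


# Each participant sees 105 / 5 = 21 stimuli at every displacement level.
counts = Counter(assign_displacements(0).values())
print(counts)
```

With 30 participants and five levels, each rotation would be used by six participants under this scheme.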


Procedure

The experiment consisted of a familiarization phase and an experimental phase. During the familiarization phase, all object-word pairs were presented to the participant. Participants could go through this list at their own pace, by pressing a button for the next pair. They were asked to familiarize themselves with the stimuli in preparation for the subsequent word-naming and object-naming blocks. We considered this training necessary because a pilot study (not reported) revealed that, without a training phase, participants hesitated (“uhh”) a lot while naming the objects (which dramatically impedes voice-onset detection) and often gave “incorrect” (according to Rossion & Pourtois, 2004) verbal responses. A familiarization phase is commonly used in object-naming paradigms (e.g., Riès, Legou, Burle, Alario, & Malfait, 2012).

Next, the experimental phase started with a nine-point grid calibration procedure. Every trial started with a one-point eye-tracker recalibration (“drift correction”). After a stable fixation was detected (at least 30 consecutive samples that did not deviate more than 0.3°), the stimulus appeared at one of the five possible displacement positions. Participants verbally named the stimulus as rapidly as possible. The stimulus remained on screen for 1,500 ms. To maximize the chance that the entire acoustic signal was saved, voice recording (but not eye-position recording) continued during the inter-trial interval (1,000 ms). This resulted in acoustic signals of 2,500 ms per trial. Typical trial sequences are shown in Fig. 2 (left panel).
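
The stable-fixation criterion can be expressed as a small gaze-contingent check. The sketch below is a hypothetical reconstruction: it assumes gaze samples are given in degrees of visual angle as (x, y) pairs, and that deviation is measured as Euclidean distance from the center of the fixation point. It does not use the OpenSesame or EyeLink APIs.

```python
import math


def stable_fixation_index(samples, target, max_deviation=0.3, min_run=30):
    """Index of the sample that completes the first run of `min_run`
    consecutive samples within `max_deviation` degrees of `target`.

    Returns None if no such run occurs.
    """
    run = 0
    for index, (x, y) in enumerate(samples):
        if math.hypot(x - target[0], y - target[1]) <= max_deviation:
            run += 1
            if run >= min_run:
                return index
        else:
            run = 0  # any deviating sample restarts the count
    return None


# Hypothetical trace: 10 samples far from the dot, then 40 samples on it.
trace = [(5.0, 5.0)] * 10 + [(0.1, 0.1)] * 40
print(stable_fixation_index(trace, target=(0.0, 0.0)))
```

At the 1,000-Hz sampling rate reported above, 30 consecutive samples correspond to 30 ms of stable gaze.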

Statistical analyses

Although the initial fixation position in a word can also affect other dependent variables (e.g., gaze duration and the duration of the second fixation; Vitu et al., 2001, 2007), for the current study we limited ourselves to response times (RTs), number of refixations, and initial-fixation durations. We investigated the effect of initial fixation position (for objects vs. words) on these three dependent variables. The purpose was to answer the following questions:
  1. Do we observe (inverted) u-shaped (I-)OVP curves for objects, just like we do for words? If so, do they differ in strength between the two stimulus types?

  2. Do we observe a leftward bias for objects just like for words?


To answer these questions, we ran separate linear mixed effects (LME) models for the different dependent variables (using the R package “lme4”; Bates, Mächler, Bolker, & Walker, 2014).

Firstly, following Yao-N'Dré, Castet and Vitu (2013), we split the factor Displacement (Fig. 2) into two separate variables: Absolute Displacement (indicating the absolute distance of the eyes relative to the stimulus' center), and Fixation Side (Left or Right), corresponding to the side the eyes initially fixated (see Fig. 3). We did this in order to be able to estimate the strength and the asymmetry of the (I-)OVP curves separately. (For an alternative approach using non-LME models, see Nuthmann, 2013b.)
Fig. 3

Distributions of initial fixation position relative to the center of the stimulus. The factor Displacement was split into two separate factors: Absolute Displacement and Fixation Side (Left or Right). To give an impression of how the normalized displacement values (upper two x labels) compared to physical units, we also plotted the corresponding values in visual degrees for the two most extreme stimulus sizes: the smallest and the largest (lower two x labels, see also Table 1)

As described above, participants had to fixate within an area of 0.3° around the center of the fixation point in order to trigger the appearance of the stimulus. Thus, the eyes were often not exactly aligned with the center of the fixation point. Therefore, we used the actual horizontal deviation of the eyes (still relative to the center, and normalized by stimulus width) as our independent variable. This resulted in a continuous variable, roughly ranging between 0 and 0.5, instead of a categorical variable with three levels (0, 0.25, or 0.5). The resulting continuous values were close to, but not identical to, the position of the fixation dot (see Fig. 3). We excluded the few trials on which the actual fixation position was exactly zero (see “Data processing and selection” for the percentage of trials excluded), because in this case we could not attribute a level (Left or Right) to the factor Fixation Side.
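The recoding of the initial fixation position into Absolute Displacement and Fixation Side can be sketched as follows. This is an illustrative helper with hypothetical names, assuming horizontal positions and stimulus width are expressed in the same units; it is not the analysis code used in the study.

```python
def split_displacement(x_eye, x_center, width):
    """Return (absolute_displacement, fixation_side), or None if exactly centered.

    The signed horizontal deviation is normalized by stimulus width, so the
    absolute value roughly ranges between 0 and 0.5.
    """
    signed = (x_eye - x_center) / width
    if signed == 0:
        return None  # no side can be assigned; such trials were excluded
    side = "Left" if signed < 0 else "Right"
    return abs(signed), side
```

For example, an eye position 1.5° to the left of the center of a 6°-wide stimulus yields an Absolute Displacement of 0.25 and a Fixation Side of Left.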

For the continuous dependent variables (RTs and fixation durations), we ran LME models with the following fixed effects: Stimulus Type (Object or Word), Absolute Displacement (continuous variable), Fixation Side (Left or Right), the interaction between Stimulus Type and Absolute Displacement, and the interaction between Stimulus Type and Fixation Side. The random-effect structure included by-participant and by-item random slopes and intercepts for all predictors. To prevent the models from having more free parameters than can reasonably be estimated from the data, we left the interactions out of the random structures.
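One way to write this model structure down is sketched below. The notation is ours and is not taken from the original analysis: $y_{pi}$ is the response of participant $p$ to item $i$, $T$ codes Stimulus Type (0 = Word, 1 = Object), $D$ is Absolute Displacement, $S$ codes Fixation Side (0 = Left, 1 = Right), and $\mathbf{u}_p$ and $\mathbf{w}_i$ are the by-participant and by-item random-effect vectors (intercept plus slopes for the three main effects, without interactions).

```latex
\begin{aligned}
y_{pi} ={}& \beta_0 + \beta_1 T_{pi} + \beta_2 D_{pi} + \beta_3 S_{pi}
           + \beta_4\, T_{pi} D_{pi} + \beta_5\, T_{pi} S_{pi} \\
          & + \mathbf{z}_{pi}^{\top}\left(\mathbf{u}_p + \mathbf{w}_i\right) + \varepsilon_{pi},
\qquad \mathbf{z}_{pi} = (1,\; T_{pi},\; D_{pi},\; S_{pi})^{\top}
\end{aligned}
```

Under this coding, $\beta_2$ and $\beta_3$ are the effects of Absolute Displacement and Fixation Side for words, and $\beta_4$ and $\beta_5$ express how much these effects differ for objects.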

To investigate the refixation-OVP effect, our dependent variable was the number of refixations that participants made on the stimulus, before giving their verbal response. Reading studies have typically reported refixation probability as a function of displacement from the word's center (Engbert, Nuthmann, Richter, & Kliegl, 2005; McConkie et al., 1989; Vitu et al., 2001). However, in the current study this dependent variable was not appropriate, because participants almost always refixated the stimulus at least once (91.37 %). Therefore, we used the number of refixations instead. We used a generalized LME model with a Poisson family to analyze this dependent variable (Jaeger, 2008).

All models provided the intercept estimate (value of the dependent variable) when all variables were at their reference values (i.e., “Word” for Stimulus Type, 0 for Absolute Displacement and “Left” for Fixation Side). To test the same effects for Objects, we ran the same models, but re-leveled Stimulus Type such that “Object” became the reference value. For LME models, an absolute t-value of 2 or greater indicates significance at the alpha level of .05 (cf. Baayen, Davidson, & Bates, 2008).
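The |t| ≥ 2 criterion amounts to comparing each estimate with its standard error. The sketch below uses hypothetical helper names and applies the criterion to two estimates reported in the “Response times” section.

```python
def t_value(estimate, se):
    """t statistic of a fixed effect: the estimate divided by its standard error."""
    return estimate / se

def is_significant(estimate, se, threshold=2.0):
    """Apply the |t| >= 2 rule of thumb for an alpha level of .05."""
    return abs(t_value(estimate, se)) >= threshold

# Absolute Displacement for words (RT): estimate = 51.17 ms, SE = 12.72 ms
print(is_significant(51.17, 12.72))
# Absolute Displacement x Stimulus Type (RT): estimate = -13.25 ms, SE = 17.92 ms
print(is_significant(-13.25, 17.92))
```

The first call reports a reliable effect and the second does not, in line with the conclusions drawn in the text.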

The results of these models helped us to answer our two main questions in the following manner:
  1. The effect of Absolute Displacement indicated whether there was an (I-)OVP effect for words (the default reference value) and objects (after re-leveling Stimulus Type and using Object as the reference value). The interaction between Absolute Displacement and Stimulus Type indicated whether the strength of the (I-)OVP effects differed between words and objects.

  2. The effect of Fixation Side indicated whether there was a leftward bias for words (the default reference value) and objects (when using Object as the reference value). The interaction between Fixation Side and Stimulus Type indicated whether the strength of the leftward bias (if present) differed between words and objects.


Data processing and selection

To estimate RTs, we first automatically determined the onset of the acoustic signal by using the package OnsetDetective (van der Linden et al., 2014). Next, the first author visually inspected the resulting onsets (while being blind to the condition), and, if necessary, manually adapted them. RTs were computed as the difference between stimulus onset and voice onset (Oldfield & Wingfield, 1964). Fixations and saccades were detected using the built-in EyeLink saccade/fixation-detection algorithm with the default parameters.

We excluded trials on which the onset of the vocal response could not be detected because the signal-to-noise ratio was too poor (1.05 %), a saccade started but did not finish during the stimulus-response interval (0.13 %), participants did not give the expected verbal response (2.63 %), or the x coordinate of the actual initial fixation position was exactly zero (0.08 %). Finally, we discarded trials on which the dependent variable differed by more than 2.5 SD from the participant's condition mean (RT: 2.23 %, initial-fixation duration: 3.62 %). After selection, objects and words did not differ in average length (objects: M = 5.98°, SD = 1.24°; words: M = 5.98°, SD = 1.23°). Thus, none of the effects reported below is confounded by an effect of stimulus length.
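The 2.5-SD trimming rule can be sketched as follows. The trial layout (dictionaries with `participant`, `condition`, and `value` keys) is hypothetical, and the sketch assumes the sample standard deviation; the text does not specify which SD estimator was used.

```python
from collections import defaultdict
from statistics import mean, stdev

def exclude_outliers(trials, n_sd=2.5):
    """Keep trials within n_sd SDs of their participant-by-condition mean."""
    cells = defaultdict(list)
    for t in trials:
        cells[(t["participant"], t["condition"])].append(t["value"])
    kept = []
    for t in trials:
        values = cells[(t["participant"], t["condition"])]
        if len(values) < 2:
            kept.append(t)  # SD is undefined for a single observation
            continue
        m, sd = mean(values), stdev(values)  # sample SD (assumption)
        if abs(t["value"] - m) <= n_sd * sd:
            kept.append(t)
    return kept
```

Means and SDs are computed once per participant-by-condition cell on the untrimmed data, so the rule is applied in a single pass rather than iteratively.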


Response times

To investigate RT as a function of initial fixation position, we ran the above-described LME models (see “Statistical Analyses”) with RT as the dependent variable. The results are shown in Fig. 4 and Table 2. We found an effect of Absolute Displacement for both words (estimate = 51.17 ms, SE = 12.72 ms, t = 4.02, see Table 2) and objects (estimate = 37.93 ms, SE = 13.24 ms, t = 2.87, obtained with Object as the reference value, not shown in Table 2). This indicates that RTs were shortest when the eyes initially fixated the middle of the stimulus, and gradually increased with horizontal deviation. The interaction between Absolute Displacement and Stimulus Type did not reach significance (estimate = −13.25 ms, SE = 17.92 ms, t = −0.74). Thus, we found no evidence for the RT-OVP effect being stronger for one of the two stimulus types.
Fig. 4

Mean response times (RTs) as a function of Stimulus Type and Displacement

Table 2

Results for the fixed effects in the linear mixed effects (LME) analyses with response time (RT) as the dependent variable. For Stimulus Type, the reference value is Word. For Fixation Side, the reference value is Left

Fixed effects: Stimulus Type (Object); Fixation Side (Right); Absolute Displacement; Stimulus Type (Object): Fixation Side (Right); Stimulus Type (Object): Absolute Displacement




For words, RTs varied as a function of Fixation Side. In line with previous findings (O’Regan & Jacobs, 1992) participants were faster to name words that they initially fixated on the left side as compared to the right side (estimate = 23.32 ms, SE = 5.24 ms, t = 4.46). For objects, RTs did not vary as a function of Fixation Side (estimate = −5.68 ms, SE = 5.42 ms, t = −1.05). The interaction between Fixation Side and Stimulus Type was reliable (estimate = −29.00 ms, SE = 6.66 ms, t = −4.35). These results suggest that there was a leftward bias for words but not for objects.
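The relation between the word-referenced and object-referenced estimates can be checked with a short sketch. It assumes treatment (dummy) coding, under which the simple effect at the non-reference level equals the reference-level effect plus the interaction; the function name is hypothetical.

```python
def slope_at_other_level(reference_slope, interaction):
    """Simple effect at the non-reference level under treatment coding."""
    return reference_slope + interaction

# RT estimates (ms) reported above, with Word as the reference level
abs_disp_object = slope_at_other_level(51.17, -13.25)  # reported for objects: 37.93
fix_side_object = slope_at_other_level(23.32, -29.00)  # reported for objects: -5.68
print(abs_disp_object, fix_side_object)
```

Both sums agree with the estimates obtained after re-leveling Stimulus Type, up to rounding of the reported values.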

Finally, we observed a main effect of Stimulus Type, indicating that participants were much faster to read aloud written words than to name the corresponding objects (estimate = 237.35 ms, SE = 16.67 ms, t = 14.23). This is a well-known effect (Cattell, 1885; Ferrand, 1999; Fraisse, 1969; Theios & Amrhein, 1989), and is explained by the fact that to name an object, semantic information is necessary, whereas this is not the case for word reading (Theios & Amrhein, 1989). Also, the relationship between stimulus and response is more uncertain for objects (e.g., “flower” or “rose” for a picture of a rose) than for words (Ferrand, 1999; Fraisse, 1969).


Next, we examined whether the number of refixations differed as a function of initial fixation position. The results are shown in Fig. 5 (left) and Table 3. We found that participants made gradually more refixations when the deviation from the center increased. This was the case for both words (effect of Absolute Displacement, estimate = 1.09, SE = 0.09, z = 11.02, p < .00001) and objects (estimate = 0.68, SE = 0.09, z = 7.21, p < .00001). Our analyses also revealed an interaction between Absolute Displacement and Stimulus Type (estimate = −0.42, SE = 0.11, z = −3.94, p = .0001). This indicates that there was a refixation-OVP effect for both stimulus types, but that it was stronger for words than for objects.
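Because the refixation model is a Poisson model, its coefficients are easier to read after exponentiation. The sketch below assumes the default log link, holds the other predictors constant, and ignores random effects; the function name is hypothetical.

```python
import math

def rate_ratio(coef, delta):
    """Multiplicative change in the expected count when the predictor changes by delta."""
    return math.exp(coef * delta)

# Moving the initial fixation from the center (0) to the edge (0.5) of the stimulus
words = rate_ratio(1.09, 0.5)
objects = rate_ratio(0.68, 0.5)
print(round(words, 2), round(objects, 2))
```

Under these assumptions, the expected number of refixations at the edge of the stimulus is roughly 1.7 times that at the center for words, and roughly 1.4 times that at the center for objects.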
Fig. 5

Mean number of refixations (left) and mean initial-fixation duration (right) as a function of Stimulus Type and Displacement

Table 3

Results for the fixed effects in the linear mixed effects (LME) analyses with Number of refixations as the dependent variable. For Stimulus Type, the reference value is Word. For Fixation Side, the reference value is Left

Fixed effects: Stimulus Type (Object); Fixation Side (Right); Absolute Displacement; Stimulus Type (Object): Fixation Side (Right); Stimulus Type (Object): Absolute Displacement





Furthermore, we found an effect of Fixation Side on the number of refixations for words (estimate = 0.12, SE = 0.03, z = 3.77, p = .0002) but not for objects (estimate = 0.0001, SE = 0.03, z = 0.004, p = .9965). The interaction between Fixation Side and Stimulus Type was reliable (estimate = −0.12, SE = 0.04, z = −2.91, p = .0036). These results reveal that participants made fewer refixations when their eyes were initially on the left side of a word than when they were on the right side. This leftward bias was not present for objects.

Initial-fixation duration

Finally, we examined whether the duration of the initial fixation differed as a function of displacement. The results are shown in Fig. 5 (right) and Table 4. We found an effect of Absolute Displacement on fixation duration for both words (estimate = −474.87 ms, SE = 57.52 ms, t = −8.26) and objects (estimate = −311.98 ms, SE = 51.66 ms, t = −5.41). Fixation duration was longest when the eyes initially fixated the stimulus center, and gradually decreased when the eyes' initial deviation became larger. Furthermore, the analyses revealed an interaction between Absolute Displacement and Stimulus Type (estimate = 162.90 ms, SE = 32.37 ms, t = 5.03). Together, these results indicate that there was a fixation-duration I-OVP effect for both stimulus types, and that it was stronger for words than for objects.
Table 4

Results for the fixed effects in the linear mixed effects (LME) analyses with Initial-fixation duration as the dependent variable. For Stimulus Type, the reference value is Word. For Fixation Side, the reference value is Left

Fixed effects: Stimulus Type (Object); Fixation Side (Right); Absolute Displacement; Stimulus Type (Object): Fixation Side (Right); Stimulus Type (Object): Absolute Displacement




Furthermore, the analyses revealed an effect of Fixation Side for words (estimate = −54.26 ms, SE = 9.38 ms, t = −5.79) but not for objects (estimate = −13.85 ms, SE = 9.58 ms, t = −1.45). The interaction between Stimulus Type and Fixation Side was reliable (estimate = 40.41 ms, SE = 11.92 ms, t = 3.39). Thus, fixation duration was longer when the eyes were on the left side of a word than when they were on the right side. This leftward bias was not present for objects.

Landing position of refixations

The above analyses revealed (I-)OVP effects for words as well as for objects, such that the ease with which the stimulus was processed depended on the initial viewing position. For objects the (I-)OVP curves appeared symmetric around the object's center, whereas for words they were asymmetric and shifted to the left of the word's center. To further investigate whether the leftward bias is unique to words, we tested whether participants indeed moved their eyes towards the supposedly optimal position in cases where they did not start off at this position (cf. Foulsham & Kingstone, 2013). To this end, we ran an extra LME model with the landing position of refixations as the dependent variable. Landing positions were expressed relative to the stimulus' width, such that a value of −0.5 meant that the eyes landed at the extreme left border of the stimulus, 0.5 indicated that the eyes landed at the extreme right border, and 0 indicated that the eyes landed at the middle of the stimulus. We entered Displacement, Stimulus Type (Object or Word), and the interaction between the two as fixed effects. The random-effect structure included by-participant and by-item intercepts and slopes for all predictors (again, we left out the interactions). We analyzed trials on which participants made at least one refixation (91.37 %), and excluded trials on which the landing position deviated by more than 2.5 SD from the participant's condition mean (1.88 %).

Figure 6 shows distributions of landing positions for the five levels of Displacement for Objects and Words. When the eyes initially fixated the left part of the stimulus, refixations shifted the eyes rightwards (e.g., the distribution of landing positions peaked to the right of the green line in Fig. 6a), and vice versa when the eyes initially fixated on the right part (e.g., the distribution peaked to the left of the green line in Fig. 6e). Importantly, the distributions for words were systematically shifted to the left of the distributions for objects. This confirms a greater leftward bias for words than for objects. The most illustrative case was when initial fixations were near the center of the stimulus: for words, refixations showed a leftward bias, whereas for objects they were equally likely to be to the left or the right of the center.
Fig. 6

Distributions of landing positions of the first within-object refixation as a function of initial displacement and stimulus type. Black dotted lines indicate the initial displacement. Gray dotted lines indicate the stimulus' center

This was confirmed by our LME analysis, which revealed a main effect of Stimulus Type, indicating that refixations within words were more leftwards as compared to refixations within objects (estimate = 0.065, SE = 0.014, t = 4.65, see Table 5). Also, when Displacement was at its reference value (0), landing positions within words were biased towards the left side of the word (estimate = −0.057, SE = 0.008, t = −6.73, see Table 5 and Fig. 6c, distribution with filled markers), whereas for objects there was no reliable bias away from the center (estimate = 0.008, SE = 0.013, t = 0.59, Fig. 6c, distribution with unfilled markers).
Table 5

Results for the fixed effects in the linear mixed effects (LME) analyses with Refixation landing position as the dependent variable. The reference value for Stimulus Type is Word

Fixed effects: Stimulus Type (Object); Stimulus Type (Object): Displacement




Finally, we found a main effect of Displacement (estimate = 0.21, SE = 0.02, t = 10.93). This indicates that as the initial fixation shifted away from the center of the stimulus by 25 % of the stimulus' width, the refixation shifted five times less. Thus, when the eyes initially fixated on the left side of the object, the subsequent refixation was directed rightwards, and landed on a less-left position. The reverse was true when the eyes initially fixated on the right side of the object.
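The size of this effect can be illustrated by plugging the reported estimates into the fixed-effects part of the model. The sketch covers words only (the reference level), ignores random effects, and uses hypothetical function names, so it is a rough illustration rather than a model prediction for any particular trial.

```python
INTERCEPT_WORD = -0.057    # landing position for words at Displacement = 0
SLOPE_DISPLACEMENT = 0.21  # main effect of Displacement

def predicted_landing(displacement):
    """Predicted refixation landing position (in stimulus widths) for words."""
    return INTERCEPT_WORD + SLOPE_DISPLACEMENT * displacement

def predicted_shift(displacement):
    """Signed distance covered by the refixation; positive means rightward."""
    return predicted_landing(displacement) - displacement

for d in (-0.5, -0.25, 0.0, 0.25, 0.5):
    print(d, round(predicted_landing(d), 3), round(predicted_shift(d), 3))
```

Starting from the left border (−0.5), the predicted landing position is −0.162, that is, a rightward shift towards the center; starting from the right border, the predicted shift is leftward.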


Many studies have shown that the ease with which a written word is processed depends on where the eyes initially fixate it. The optimal viewing position (OVP) is found to be at the word's center (for short words) or just to the left of the center (for longer words), at least for languages that are read from left to right (O’Regan et al., 1984; for reviews see Brysbaert & Nazir, 2005; Rayner, 1998; Vitu et al., 2001). Fixating on this position facilitates word identification (O’Regan & Jacobs, 1992) and lowers the chance of within-word refixations (McConkie et al., 1989; O’Regan & Lévy-Schoen, 1987; Rayner et al., 1996; Vitu et al., 1990), but lengthens initial fixation durations (I-OVP effect, Vitu et al., 2001). Several studies have shown that initial viewing position also influences object processing (Foulsham & Kingstone, 2013; Henderson, 1993; Pajak & Nuthmann, 2013). However, these studies did not directly compare the resulting (I-)OVP curves between words and objects. Comparing the effects between the two stimulus types is interesting, because it might help to disentangle the contribution of visual versus language constraints.

Therefore, the purpose of the current study was to investigate the effect of initial-fixation position on task performance and eye movements for words versus objects. We confirmed the existence of (I-)OVP effects for both stimulus types. Interestingly, however, we found differences in both the strength and the asymmetry of the (I-)OVP curves. Firstly, (I-)OVP effects were weaker for objects than for words. Secondly, for words, the optimum of the curve was shifted to the left of the word's center, whereas for objects we found no asymmetry.

All viewing-position effects generalize to object viewing

The (inverted) u-curve of (I-)OVP effects for words is traditionally explained in terms of letter visibility and is assumed to result from both the decrease of visual acuity with retinal eccentricity and crowding (McConkie et al., 1989; Nazir et al., 1991, 1998). The decrease of visual acuity with eccentricity is due to the non-homogeneity of the retina. Therefore, it should not only constrain the processing of letter strings, but also other visual stimuli such as objects. Thus, visual-information uptake should be enhanced when the eyes fixate an object's center (benefiting maximally from high-acuity foveal vision and suffering least from crowding) as compared to the object's boundaries ('losing' some high-acuity vision on off-object locations, while suffering from parafoveal vision and crowding on the non-fixated side). Our results confirmed this. We found that participants were faster to name isolated objects and words when their eyes initially fixated in the middle of the stimulus, and became gradually slower when their eyes initially fixated towards the boundaries. Similarly, participants made fewer refixations when the eyes started at the center than when they started near the boundaries of the stimulus.

Initial fixation durations also varied with fixation location. Fixations were longer towards the center of isolated words and objects, which is in line with previous reports on word identification (Vitu et al., 2007), sentence reading (Hyönä & Bertram, 2011; Nuthmann et al., 2005, 2007; Vitu et al., 2001), object recognition (Henderson, 1993) and scene viewing (Pajak & Nuthmann, 2013). This effect, referred to as the inverted OVP (I-OVP) effect, might appear counterintuitive when assuming that fixation duration is an index of processing difficulty. Yet the fact that the resulting curves have the same shape (although inverted) as the other OVP effects suggests that all reflect the same phenomenon. The exact relationship between the three dependent variables is a research topic in itself, and has been the subject of an interesting debate (see e.g., Nuthmann et al., 2005, 2007; Vitu et al., 2001, 2007).

Following Vitu and colleagues' (2007) perceptual economy account, we could reason as follows: Due to lifelong experience with reading and scene viewing, the visual system has learned that stimuli are identified best when the eyes initially fixate near the center. When the eyes land on (or, as in the current study, are exposed to) a stimulus, the visual system estimates the location of the eyes. If the eyes are estimated to be close to the OVP, fixation is prolonged, because optimal visual-information uptake is anticipated (Vaughan, 1978). This gives rise to the fixation-duration I-OVP curve. In turn, because stimulus processing occurs in parallel with the initial fixation, the likelihood of a refixation decreases, giving rise to the refixation-OVP curve. The RT-OVP curve then results from the combination of both refixation likelihood and individual fixation duration. The fact that (I-)OVP effects generalize to object viewing suggests that similar perceptual-economy processes may be at work during word and object recognition.

This interpretation is appealing for the present results on isolated stimuli. However, it does not necessarily generalize to objects in scenes, because Pajak and Nuthmann (2013) did not find a refixation-OVP effect for objects in scenes, and only a fixation-duration I-OVP effect for large objects. Nevertheless, the fact that we found (I-)OVP effects in both word and object viewing in the current study points to the universal character of perceptual-economy processes (but see Nuthmann, 2013a; Nuthmann et al., 2007).

Importantly, we also found differences between the (I-)OVP effects for objects versus words. The curves differed in two respects: asymmetry and steepness (i.e., the strength of the effect of initial-fixation position). We discuss both differences below.

Differences between words and objects

Regarding the asymmetry of the curves, we found that the curves for words were systematically biased to the left (O’Regan & Jacobs, 1992), whereas for objects they were symmetric. As outlined in the Introduction, several explanations have been proposed for the leftward bias in word processing: reading habits (Nazir et al., 2004), hemispheric specialization (Brysbaert, 1994), and word ambiguity (Clark & O’Regan, 1999). Some indirect evidence suggests that two of these mechanisms, hemispheric specialization and reading habits, might transfer to object processing: Object naming benefits from a right-visual field advantage (Hunter & Brysbaert, 2008), and reading habits have been found to generalize to face viewing (Heath et al., 2005; Hsiao & Cottrell, 2008; Vaid & Singh, 1989). This could have favored a leftward asymmetry of (I-)OVP curves in objects.

However, we did not observe this. For objects, (I-)OVP curves were symmetric. This might indicate that lexical ambiguity is crucial for the leftward bias. Clark and O’Regan (1999) showed that fixating just to the left of a word's center most strongly reduces lexical ambiguity. This is in line with the fact that, on average, the beginning of a word is most informative (Holmes & O’Regan, 1987; O’Regan et al., 1984). For objects, on the other hand, there is no systematic bias in the distribution of diagnostic information, at least across object categories and visual orientations (see Footnote 1). This might explain the absence of an asymmetry in (I-)OVP curves for objects. Future studies could test this claim, for example by manipulating the distribution of information within objects, by making the “beginning” of the object more informative. For example, all animals could be presented with their head to the left, all utensils could be oriented with their action-identifying part to the left (van der Linden, Mathôt, & Vitu, 2015), etc. In this case, Clark and O'Regan's ambiguity account might predict a leftward bias (see Footnote 2).

Pseudo-neglect (Bowers & Heilman, 1980) is another factor that might generate (I-)OVP asymmetries in objects, though in the opposite direction, that is, favoring the right side of the object. This might be worth investigating in future studies. Pseudo-neglect refers to the phenomenon that the majority of observers favor attending to the left visual field. For example, observers typically start exploring scenes by saccading to the left side of the image (Foulsham, Gray, Nasiopoulos, & Kingstone, 2013; Nuthmann & Matthias, 2014). It would be interesting to examine how this relates to (I-)OVP effects in objects. If pseudo-neglect affects isolated object viewing, the optimum of the curves should shift to the right, such that the majority of the stimulus falls into the preferred left visual field. To test this, a non-language-related task should be used, to prevent language-related asymmetries from counteracting any potential pseudo-neglect effects.

Regarding the strength of the (I-)OVP effects, we found that the effects on eye movements (refixation-OVP and fixation-duration I-OVP) were weaker for objects than for words. There are several possible explanations for this difference. Firstly, as outlined in the Introduction, Clark and O'Regan (1999) suggested that the (I-)OVP curves for word identification are steeper than they would be due to visual acuity and crowding only. According to these authors, not only visual constraints but also lexical constraints shape the u-curves. As mentioned above, this “boosting” effect of lexical ambiguity is probably specific to words. This might explain why we found stronger (I-)OVP effects when participants were identifying words as compared to objects.

Secondly, it is possible that crowding influences the visibility of letters as well as the visibility of object features, but not to the same extent. Perhaps the number of object features in our object stimuli was larger (or smaller) than the number of letters or letter features, or the object features were not as similar to one another as letters or letter features are. As a consequence, the influence of crowding on the OVP curves might have been stronger for words than for objects.

Thirdly, the strength difference might be explained by low-level visual properties of the stimuli that were not controlled in the current study. Words are much wider than they are high, whereas this is, in general, not the case for objects. As a consequence, manipulating horizontal initial-fixation position might have a stronger effect on word processing than object processing, because for the latter, the vertical position can compensate for any inconvenient horizontal position.

Finally, in general, and in our study as well, naming an object takes much longer than naming a word (Cattell, 1885; Ferrand, 1999; Fraisse, 1969; Theios & Amrhein, 1989). This increase in RT may have made eye-movement behavior on objects less sensitive to the influence of initial-fixation position. For example, the number of refixations on objects (with longer RTs) might be more susceptible to ceiling effects than the number of refixations on words (with shorter RTs).

In contrast to the eye-movement measures, the RT-OVP effect did not reveal a difference in strength between objects and words. We think this can be explained as follows. As mentioned by Vitu and colleagues (Vitu, 2011; Vitu & O’Regan, 2004), RT is a composite measure, reflecting the sum of all fixation durations on a stimulus. Thus, the slope of the RT-OVP curve depends on the number of refixations and the duration of all fixations on the stimulus (and not just the duration of the first fixation). In our study, participants made many refixations. Therefore, the duration of refixations (instead of initial fixations) may have compensated for the strength difference in refixation-OVP and fixation-duration I-OVP curves.

Landing positions of refixations

In our study, participants almost always made at least one refixation before initiating their verbal response. This high refixation rate can be explained by several factors, or a combination of them. Firstly, our stimuli were large (with the width ranging between 3.41° and 6.82°, see Table 1). Secondly, a naming task is more complex than for example the object/non-object task used by Foulsham and Kingstone (2013). This results in long stimulus-response intervals (see Fig. 4), increasing the likelihood that the initial fixation will at some point be aborted.

We investigated the landing position of these refixations. In line with our (I-)OVP results, we found that refixations within words showed a leftward bias, whereas refixations within objects showed a central bias. Of particular interest were the trials where the eyes started at the center. In this case, when viewing a word, participants systematically made a refixation to the left, thus likely towards the most informative part (Clark & O’Regan, 1999). When viewing an object at its center, participants also tended to refixate. This is more surprising, because their eyes were already at the assumed optimal position. During these refixations, the eyes showed no systematic bias towards either side of the object. This is consistent with Pajak and Nuthmann's (2013) findings for objects in natural scenes. They observed that refixations launched from object boundaries bring the eyes closer to the object's center, whereas refixations that are launched from the center keep the eyes close to the center.

(I-)OVP effects in naturalistic scene viewing?

The (I-)OVP effects on task and eye-movement performance observed here are in line with several previous studies on isolated objects (Foulsham & Kingstone, 2013, Experiment 1; Foulsham & Underwood, 2009; Henderson, 1993). To what extent do these effects generalize to objects embedded in natural scenes? Two previous studies investigated this question (Foulsham & Kingstone, 2013; Pajak & Nuthmann, 2013). In line with isolated-object studies, Pajak and Nuthmann observed a fixation-duration I-OVP effect (for large objects). With regard to refixation behavior, however, neither of these two studies found a typical refixation-OVP curve. This discrepancy with our current study needs an explanation.

A first difference between the objects-in-scenes studies and our current study is that scene studies used a free-viewing paradigm, whereas we experimentally imposed fixation positions. Yet, it seems unlikely that this methodological difference explains the difference in refixation-OVP curves. After all, free-viewing paradigms using isolated objects did reveal classical refixation-OVP effects (Foulsham & Underwood, 2009; Henderson, 1993).

Instead, the discrepancy between objects-in-scene studies and our study might indicate that refixation behavior changes depending on the relevance of objects for the task at hand. In our current experiment, participants necessarily had to identify/recognize the objects to make a correct response. During free viewing, however, some objects might be fixated for different purposes than identifying them (for a similar argument, see Foulsham & Kingstone, 2013). Furthermore, scene context might facilitate object identification, which, in turn, reduces the overall need to refixate objects. Moreover, Pajak and Nuthmann (2013) speculate that the larger perceptual span in scene perception may reduce the need for corrective refixations (Nuthmann, 2013b). (However, none of this would explain why Foulsham and Kingstone found an inverted refixation-OVP effect in natural scenes.)

Finally, it has been proposed that the refixation-OVP and the fixation-duration I-OVP effects rely on the extraction of stimulus boundaries (O’Regan, 1990; O’Regan & Lévy-Schoen, 1987; Vitu et al., 2001, 2007; see also Nuthmann et al., 2005). The presence of a complex background in visual scenes might make object boundaries less visible (although not invisible), making their extraction more difficult. This, in turn, might flatten the OVP curves for objects in scenes. To test this, a logical follow-up to the current study would be to place the objects on a complex background, while keeping the rest of the design constant. This would enable a more direct comparison between (I-)OVP effects for isolated objects on homogeneous backgrounds versus objects embedded in realistic visual scenes.


We investigated whether there is an optimal viewing position (OVP) for the visual processing of objects, just like there is for words. We examined task performance and eye movements as a function of where the eyes initially fixated in a word or an object (cf. Foulsham & Kingstone, 2013, Experiment 1), and found that object viewing was optimal when the eyes fixated in the middle of the object. The further the eyes were away from this OVP, the larger the cost on task performance and eye movements. These effects were weaker for objects than for words. Moreover, (I-)OVP curves for objects were symmetric around the center, whereas (I-)OVP curves for words showed the typical leftward bias. This difference might indicate that part of (I-)OVP effects for words is language specific, and hence that (I-)OVP effects for objects are a purer measure of the influence of visual constraints.
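The distinction drawn above between a symmetric curve and a leftward-biased one can be made concrete with a small sketch. The code below is an illustration only and is not the analysis used in this study: it fits a parabola to a u-shaped curve and reads off the vertex as an estimate of the OVP. The function names (`fit_parabola`, `ovp_location`), the position units, and all probability values are hypothetical and chosen purely for demonstration.

```python
# Hypothetical illustration: none of these numbers come from the study.

def fit_parabola(xs, ys):
    """Least-squares fit of y = a*x**2 + b*x + c; returns (a, b, c)."""
    # Power sums needed for the 3x3 normal equations.
    s = [sum(x ** k for x in xs) for k in range(5)]                  # s[k] = sum of x^k
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]  # t[k] = sum of y*x^k
    # Augmented matrix for the unknowns (a, b, c).
    m = [
        [s[4], s[3], s[2], t[2]],
        [s[3], s[2], s[1], t[1]],
        [s[2], s[1], s[0], t[0]],
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


def ovp_location(xs, ys):
    """Vertex of the fitted parabola, taken as the estimated OVP."""
    a, b, _ = fit_parabola(xs, ys)
    return -b / (2 * a)


# Initial-fixation positions relative to the stimulus center
# (arbitrary units; negative = left of center).
positions = [-2, -1, 0, 1, 2]

# Made-up refixation probabilities generated from exact parabolas:
# a shallow symmetric curve and a steeper, left-shifted curve.
object_curve = [0.20 + 0.05 * x ** 2 for x in positions]        # minimum at 0
word_curve = [0.15 + 0.08 * (x + 0.5) ** 2 for x in positions]  # minimum at -0.5

print(ovp_location(positions, object_curve))
print(ovp_location(positions, word_curve))
```

In this toy setup, the vertex position captures the (a)symmetry of the curve, and the quadratic coefficient `a` captures its steepness, that is, how quickly the cost grows as the eyes move away from the OVP. Real data would of course be noisy and would call for a statistical model rather than a bare curve fit.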


  1. This is not to say that the center of an object always contains the most diagnostic information. It is likely that different objects have different most-diagnostic locations. For example, in the case of animals and people, fixating the head might be most informative (cf. the preferred-viewing location as a function of object category reported by Yun, Peng, Samaras, Zelinsky, & Berg, 2013), whereas in the case of tools, fixating the action-performing part might be most informative (van der Linden, Mathôt, & Vitu, 2015). Thus, different objects might have different optimal and/or preferred viewing positions. However, when objects are randomly selected with respect to this feature, we do not predict that this feature will show a systematic bias away from the center. This is a clear difference between objects and words. A set of randomly selected (French) words will most probably follow Clark and O'Regan's (1999) lexical-ambiguity curves, and hence contain more diagnostic information at the left side (the beginning of the word).

  2. The “bubbles” technique described by Gosselin and Schyns (2001) could be used to obtain a more formal index of the distribution of information within objects. This technique reveals the parts of a stimulus that are diagnostic of performance (in our case, correct object identification).



Lotje van der Linden was supported by a grant (“allocation de recherche”) from the French Ministry of Research (2012–2015). We presented the current study at the 2014 Annual Meeting of the Vision Sciences Society in St. Pete Beach, Florida, USA. We thank Rouchdati Cheikh for helping us with the data collection. We also thank Marc Brysbaert, Tom Foulsham, and Antje Nuthmann for their helpful comments on an earlier version of this manuscript.


  1. Baayen, R. H., Davidson, D., & Bates, D. M. (2008). Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language, 59(4), 390–412.
  2. Bates, D. M., Mächler, M., Bolker, B., & Walker, S. (2014). lme4: Linear mixed-effects models using Eigen and S4. R package version 1.0.
  3. Bouma, H. (1970). Interaction effects in parafoveal letter recognition. Nature, 226, 177–178.
  4. Bowers, D., & Heilman, K. M. (1980). Pseudoneglect: Effects of hemispace on a tactile line bisection task. Neuropsychologia, 18(4), 491–498.
  5. Bryden, M. P. (1982). Laterality: Functional asymmetry in the intact brain. New York: Academic Press.
  6. Brysbaert, M. (1994). Interhemispheric transfer and the processing of foveally presented stimuli. Behavioural Brain Research, 64(1), 151–161.
  7. Brysbaert, M. (2004). The importance of interhemispheric transfer for foveal vision: A factor that has been overlooked in theories of visual word recognition and object perception. Brain and Language, 88(3), 259–267.
  8. Brysbaert, M., Cai, Q., & van der Haegen, L. (2012). Brain asymmetry and visual word recognition: Do we have a split fovea? In J. S. Adelman (Ed.), Visual word recognition, Volume 1: Models and methods, orthography and phonology (pp. 139–158). Hove: Psychology Press.
  9. Brysbaert, M., & Nazir, T. A. (2005). Visual constraints in written word recognition: evidence from the optimal viewing-position effect. Journal of Research in Reading, 28(3), 216–228.
  10. Brysbaert, M., Vitu, F., & Schroyens, W. (1996). The right visual field advantage and the optimal viewing position effect: On the relation between foveal and parafoveal word recognition. Neuropsychology, 10, 385–395.
  11. Cattell, J. M. (1885). Über die Zeit der Erkennung und Benennung von Schriftzeichen, Bildern und Farben [The time it takes to recognize and name letters, pictures, and colors]. Philosophische Studien, 2, 635–650.
  12. Clark, J. J., & O’Regan, J. K. (1999). Word ambiguity and the optimal viewing position in reading. Vision Research, 39(4), 843–857.
  13. Engbert, R., Nuthmann, A., Richter, E. M., & Kliegl, R. (2005). SWIFT: a dynamical model of saccade generation during reading. Psychological Review, 112(4), 777.
  14. Farid, M., & Grainger, J. (1996). How initial fixation position influences visual word recognition: A comparison of French and Arabic. Brain and Language, 53(3), 351–368.
  15. Ferrand, L. (1999). Why naming takes longer than reading? The special case of Arabic numbers. Acta Psychologica, 100(3), 253–266.
  16. Foulsham, T., Gray, A., Nasiopoulos, E., & Kingstone, A. (2013). Leftward biases in picture scanning and line bisection: A gaze-contingent window study. Vision Research, 78, 14–25.
  17. Foulsham, T., & Kingstone, A. (2013). Optimal and preferred eye landing positions in objects and scenes. The Quarterly Journal of Experimental Psychology, 66(9), 1707–1728.
  18. Foulsham, T., & Underwood, G. (2009). Does conspicuity enhance distraction? Saliency and eye landing position when searching for objects. The Quarterly Journal of Experimental Psychology, 62(6), 1088–1098.
  19. Fraisse, P. (1969). Why is naming longer than reading? Acta Psychologica, 30, 96–103.
  20. Gosselin, F., & Schyns, P. G. (2001). Bubbles: a technique to reveal the use of information in recognition tasks. Vision Research, 41(17), 2261–2271.
  21. Heath, R., Rouhana, A., & Abi Ghanem, D. (2005). Asymmetric bias in perception of facial affect among Roman and Arabic script readers. Laterality: Asymmetries of Body, Brain, and Cognition, 10(1), 51–64.
  22. Hellige, J. B. (1990). Hemispheric asymmetry. Annual Review of Psychology, 41(1), 55–80.
  23. Henderson, J. M. (1993). Eye movement control during visual object processing: Effects of initial fixation position and semantic constraint. Canadian Journal of Experimental Psychology/Revue Canadienne de Psychologie Expérimentale, 47(1), 79–98.
  24. Holmes, V. M., & O’Regan, J. K. (1987). Decomposing French words. In J. K. O’Regan & A. Lévy-Schoen, Eye movements: From physiology to cognition (pp. 459–466). Amsterdam, The Netherlands.
  25. Hsiao, J. H., & Cottrell, G. (2008). Two fixations suffice in face recognition. Psychological Science, 19(10), 998–1006.
  26. Hunter, Z. R., & Brysbaert, M. (2008). Visual half-field experiments are a good measure of cerebral language dominance if used properly: Evidence from fMRI. Neuropsychologia, 46(1), 316–325.
  27. Hyönä, J., & Bertram, R. (2011). Optimal viewing position effects in reading Finnish. Vision Research, 51(11), 1279–1287.
  28. Inhoff, A. W., & Rayner, K. (1986). Parafoveal word processing during eye fixations in reading: Effects of word frequency. Perception & Psychophysics, 40(6), 431–439.
  29. Jaeger, T. F. (2008). Categorical data analysis: Away from ANOVAs (transformation or not) and towards logit mixed models. Journal of Memory and Language, 59, 434–446.
  30. Levi, D. M., Klein, S. A., & Aitsebaomo, A. P. (1985). Vernier acuity, crowding and cortical magnification. Vision Research, 25(7), 963–977.
  31. Martelli, M., Majaj, N. J., & Pelli, D. G. (2005). Are faces processed like words? A diagnostic test for recognition by parts. Journal of Vision, 5(1), 58–70. doi: 10.1167/5.1.6
  32. Mathôt, S., Schreij, D., & Theeuwes, J. (2012). OpenSesame: An open-source, graphical experiment builder for the social sciences. Behavior Research Methods, 44(2), 314–324. doi: 10.3758/s13428-011-0168-7
  33. McConkie, G. W., Kerr, P. W., Reddix, M. D., Zola, D., & Jacobs, A. M. (1989). Eye movement control during reading: II. Frequency of refixating a word. Perception & Psychophysics, 46(3), 245–253.
  34. McConkie, G. W., & Rayner, K. (1976). Asymmetry of the perceptual span in reading. Bulletin of the Psychonomic Society, 8(5), 365–368.
  35. Nazir, T. A. (2000). Traces of print along the visual pathway. In A. Kennedy, R. Radach, D. Heller, & J. Pynte (Eds.), Reading as a Perceptual Process (pp. 3–22). Amsterdam: Elsevier.
  36. Nazir, T. A., Ben-Boutayab, N., Decoppet, N., Deutsch, A., & Frost, R. (2004). Reading habits, perceptual learning, and recognition of printed words. Brain and Language, 88(3), 294–311.
  37. Nazir, T. A., Heller, D., & Sussmann, C. (1992). Letter visibility and word recognition: The optimal viewing position in printed words. Perception & Psychophysics, 52(3), 315–328.
  38. Nazir, T. A., Jacobs, A. M., & O’Regan, J. K. (1998). Letter legibility and visual word recognition. Memory & Cognition, 26(4), 810–821.
  39. Nazir, T. A., O’Regan, J. K., & Jacobs, A. M. (1991). On words and their letters. Bulletin of the Psychonomic Society, 29(2), 171–174.
  40. New, B., Brysbaert, M., Veronis, J., & Pallier, C. (2007). The use of film subtitles to estimate word frequencies. Applied Psycholinguistics, 28(04), 661–677.
  41. New, B., Pallier, C., Brysbaert, M., & Ferrand, L. (2004). Lexique 2: A new French lexical database. Behavior Research Methods, Instruments, & Computers, 36(3), 516–524.
  42. New, B., Pallier, C., Ferrand, L., & Matos, R. (2001). Une base de données lexicales du français contemporain sur internet: LEXIQUE™ // A lexical database for contemporary French: LEXIQUE™. L’année Psychologique, 101(3), 447–462.
  43. Nuthmann, A. (2013a). Not fixating at the line of text comes at a cost. Attention, Perception, & Psychophysics, 75(8), 1604–1609.
  44. Nuthmann, A. (2013b). On the visual span during object search in real-world scenes. Visual Cognition, 21(7), 803–837. doi: 10.1080/13506285.2013.832449
  45. Nuthmann, A., Engbert, R., & Kliegl, R. (2005). Mislocated fixations during reading and the inverted optimal viewing position effect. Vision Research, 45(17), 2201–2217.
  46. Nuthmann, A., Engbert, R., & Kliegl, R. (2007). The IOVP effect in mindless reading: Experiment and modeling. Vision Research, 47(7), 990–1002.
  47. Nuthmann, A., & Henderson, J. M. (2010). Object-based attentional selection in scene viewing. Journal of Vision, 10(8), 1–19.
  48. Nuthmann, A., & Matthias, E. (2014). Time course of pseudoneglect in scene viewing. Cortex, 52, 113–119.
  49. Oldfield, R. C., & Wingfield, A. (1964). The time it takes to name an object. Nature, 202, 1031–1032.
  50. O’Regan, J. K. (1990). Eye movements and reading. In E. Kowler (Ed.), Reviews of oculomotor research (Vol. 4, pp. 395–453). Amsterdam, The Netherlands: Elsevier.
  51. O’Regan, J. K., & Jacobs, A. M. (1992). Optimal viewing position effect in word recognition: a challenge to current theory. Journal of Experimental Psychology: Human Perception and Performance, 18(1), 185–197.
  52. O’Regan, J. K., & Lévy-Schoen, A. (1987). Eye-movement strategy and tactics in word recognition and reading. In M. Coltheart (Ed.), Attention and Performance XII: The Psychology of Reading (pp. 363–383). Hillsdale, NJ: Erlbaum.
  53. O’Regan, J. K., Lévy-Schoen, A., Pynte, J., & Brugaillère, B. (1984). Convenient fixation location within isolated words of different length and structure. Journal of Experimental Psychology: Human Perception and Performance, 10(2), 250–257.
  54. Pajak, M., & Nuthmann, A. (2013). Object-based saccadic selection during scene perception: Evidence from viewing position effects. Journal of Vision, 13(5), 1–21. doi: 10.1167/13.5.2
  55. Pelli, D. G. (2008). Crowding: A cortical constraint on object recognition. Current Opinion in Neurobiology, 18(4), 445–451.
  56. Pelli, D. G., & Tillman, K. A. (2008). The uncrowded window of object recognition. Nature Neuroscience, 11(10), 1129–1135. doi: 10.1038/nn.2187
  57. Rayner, K. (1979). Eye guidance in reading: Fixation locations within words. Perception, 8(1), 21–30.
  58. Rayner, K. (1998). Eye movements in reading and information processing: 20 years of research. Psychological Bulletin, 124(3), 372–422.
  59. Rayner, K., Sereno, S. C., & Raney, G. E. (1996). Eye movement control in reading: a comparison of two types of models. Journal of Experimental Psychology: Human Perception and Performance, 22(5), 1188–1200.
  60. Rayner, K., Well, A. D., & Pollatsek, A. (1980). Asymmetry of the effective visual field in reading. Perception & Psychophysics, 27(6), 537–544.
  61. Riès, S., Legou, T., Burle, B., Alario, F. X., & Malfait, N. (2012). Why does picture naming take longer than word reading? The contribution of articulatory processes. Psychonomic Bulletin & Review, 19, 955–961.
  62. Rossion, B., & Pourtois, G. (2004). Revisiting Snodgrass and Vanderwart’s object pictorial set: The role of surface detail in basic-level object recognition. Perception, 33(2), 217–236.
  63. Stevens, M., & Grainger, J. (2003). Letter visibility and the viewing position effect in visual word recognition. Perception & Psychophysics, 65(1), 133–151.
  64. Theios, J., & Amrhein, P. C. (1989). Theoretical analysis of the cognitive processing of lexical and pictorial stimuli: Reading, naming, and visual and conceptual comparisons. Psychological Review, 96(1), 5–24.
  65. Underwood, N. R., & McConkie, G. W. (1985). Perceptual span for letter distinctions during reading. Reading Research Quarterly, 153–162.
  66. Vaid, J., & Singh, M. (1989). Asymmetries in the perception of facial affect: Is there an influence of reading habits? Neuropsychologia, 27(10), 1277–1287.
  67. Van der Haegen, L., Cai, Q., Stevens, M. A., & Brysbaert, M. (2013). Interhemispheric communication influences reading behavior. Journal of Cognitive Neuroscience, 25(9), 1442–1452.
  68. van der Linden, L., Mathôt, S., & Vitu, F. (2015). The role of object affordances and center of gravity in eye movements toward isolated daily-life objects. Journal of Vision, 15(5), 1–18. doi: 10.1167/15.5.8
  69. van der Linden, L., Riès, S. K., Legou, T., Burle, B., Malfait, N., & Alario, F.-X. (2014). A comparison of two procedures for verbal response time fractionation. Frontiers in Psychology, 5. doi: 10.3389/fpsyg.2014.01213
  70. Vaughan, J. (1978). Control of visual fixation duration in search. Eye Movements and the Higher Psychological Processes, 135–142.
  71. Vitu, F. (2011). On the role of visual and oculomotor processes in reading. In S. Liversedge, I. D. Gilchrist, & S. Everling (Eds.), The Oxford handbook of eye movements (pp. 731–749). Oxford, UK: Oxford University Press.
  72. Vitu, F., Lancelin, D., & Marrier d’Unienville, V. (2007). A perceptual-economy account for the inverted-optimal viewing position effect. Journal of Experimental Psychology: Human Perception and Performance, 33(5), 1220.
  73. Vitu, F., McConkie, G. W., Kerr, P., & O’Regan, J. K. (2001). Fixation location effects on fixation durations during reading: an inverted optimal viewing position effect. Vision Research, 41(25-26), 3513–3531.
  74. Vitu, F., & O’Regan, J. K. (2004). Les mouvements oculaires comme indice «on-line» des processus cognitifs: rêve ou réalité? In L. Ferrand & J. Grainger (Eds.), Psycholinguistique cognitive: Essai en l’honneur de Juan Segui (pp. 189–214). Bruxelles: De Boeck Université, Collection Neurosciences et Cognition.
  75. Vitu, F., O’Regan, J. K., & Mittau, M. (1990). Optimal landing position in reading isolated words and continuous text. Perception & Psychophysics, 47(6), 583–600.
  76. Wallace, J. M., & Tjan, B. S. (2011). Object crowding. Journal of Vision, 11(6). doi: 10.1167/11.6.19
  77. Wertheim, T. H. (1894). Über die indirekte Sehschärfe. Zeitschrift Für Psychologie, 7, 172–187.
  78. Whitney, C. (2001). How the brain encodes the order of letters in a printed word: The SERIOL model and selective literature review. Psychonomic Bulletin & Review, 8(2), 221–243.
  79. Yao-N’Dré, M., Castet, E., & Vitu, F. (2013). The Optimal Viewing Position effect in the lower visual field. Vision Research, 76, 114–123. doi: 10.1016/j.visres.2012.10.018
  80. Yun, K., Peng, Y., Samaras, D., Zelinsky, G. J., & Berg, T. L. (2013). Exploring the role of gaze behavior and object detection in scene understanding. Frontiers in Psychology, 4, 1–14. doi: 10.3389/fpsyg.2013.00917

Copyright information

© The Psychonomic Society, Inc. 2015

Authors and Affiliations

  1. Laboratoire de Psychologie Cognitive (LPC), UMR 7290, CNRS, Aix-Marseille Université, Marseille cedex 03, France
