During reading, the eyes move along lines of text in a sequence of saccadic movements separated by brief fixational pauses during which visual information is acquired. This behavior results from changes in retinal acuity, which is greatest at the point of fixation and declines sharply with increasing eccentricity (Hilz & Cavonius, 1974). Saccadic eye movements compensate for this limited acuity by producing shifts in the location of fixations so that text previously located away from the point of fixation is brought into high acuity vision.

Research on the spatial and temporal characteristics of eye movements is crucial for revealing the influence of the visual characteristics of text on when and where the eyes move during reading (e.g., Rayner, 2009), and is central to the development of models of eye movement control (e.g., Engbert, Nuthmann, Richter, & Kliegl, 2005; Reichle, Rayner, & Pollatsek, 2003). However, research on this topic has been conducted primarily in languages based on the Latin alphabet (e.g., English, French, German), and, while recent research has examined non-alphabetic languages like Chinese (e.g., Li, Liu, & Rayner, 2011), little is known about eye movements for alphabetic languages with fundamentally different visual characteristics. Arabic is the second-most widely read alphabetic language (after English) across the globe, yet few studies have examined eye movements when reading Arabic (e.g., Roman & Pavard, 1987; Roman, Pavard, & Asseleh, 1985) and, with the exception of a recent investigation of the perceptual span (Jordan, Almabruk, et al., 2014), these have not examined fundamental visual influences on eye movement control. Yet, in addition to being read from right to left, Arabic is printed in cursive script in which individual letters generally are not well segregated, and letter size and shape can vary depending on the location within words (Ibrahim, Eviatar, & Aharon-Peretz, 2002; Jordan, Sheen, AlJassmi, & Paterson, 2015). Accordingly, research on the influence of the visual characteristics of Arabic on readers’ eye movements would extend substantially our understanding of visual influences on oculomotor control across different languages.

A particular consideration for the present research is that, in Latinate languages, word length has a major influence on where readers look and for how long. Specifically, longer words are more likely to be fixated and receive more refixations than short words (e.g., Joseph, Liversedge, Blythe, White, & Rayner, 2009; Kliegl, Grabner, Rolfs, & Engbert, 2004; Paterson, McGowan, & Jordan, 2013; Rayner & McConkie, 1976; Rayner, Slattery, Drieghe, & Liversedge, 2011; Vitu, O’Regan, Inhoff, & Topolski, 1995) and, for words fixated only once during their initial processing, the length of this fixation (single-fixation duration) is longer for longer words (Rayner, Sereno, & Raney, 1996). Moreover, when words receiving multiple fixations during initial processing are included, this total fixation time (gaze duration) is greater for longer words (e.g., Joseph et al., 2009; Juhasz & Rayner, 2003; Kliegl et al., 2004; Paterson et al., 2013; Rayner & McConkie, 1976). Taken together, these findings reveal that word length has a crucial influence on eye movement control. In particular, the finding that word length influences fixation probability indicates that cues to word length are processed parafoveally (i.e., outside central vision) and used to target forward-moving saccades. Moreover, effects of word length on refixation probability and fixation times show longer words require additional fixations (or fixation time) to process all their parts.

Additionally, readers of Latinate languages tend to initially fixate within a region between the beginning and middle of words, which Rayner (1979) termed the preferred viewing location (PVL; see also Joseph et al., 2009; McConkie, Kerr, Reddix & Zola, 1988; Paterson et al., 2013). This suggests the oculomotor system uses parafoveal cues to word length to target saccades towards specific locations in words. However, saccade accuracy is subject to random error and the range effect, which is a tendency to overshoot close targets and undershoot more distant targets (McConkie et al., 1988; but see Vitu, 1991). Consequently, not all fixations are made at the PVL, and fixations often land further to the left for longer words and to the right for short words (e.g., Joseph et al., 2009; McConkie et al., 1988; Paterson et al., 2013). Readers also tend to make shorter fixations and are more likely to refixate words when the landing position is nearer the beginning or end of words (Nuthmann, Engbert & Kliegl, 2005; Rayner et al., 1996; Vitu, McConkie, Kerr & O’Regan, 2001). However, investigations of word length effects typically have used stimuli in which letter length (number of letters) is confounded with spatial extent (McDonald, 2006), and research that has disentangled these influences suggests that, while number of letters influences fixation times for words, spatial extent primarily determines landing positions and skipping probabilities, and so these components of word length separately influence eye movement control (Hautala, Hyönä, & Aro, 2011). But while robust influences of word length on these crucial aspects of eye movement behavior are observed for Latinate languages, influences of word length when reading Arabic remain to be established.

Although isolated-word recognition is very different from textual reading, studies using isolated words in Latinate languages are also relevant to this issue. These studies suggest that words are recognized most efficiently when fixated at the optimal viewing position (OVP; Jordan, Paterson, Kurtev, & Xu, 2010; O’Regan, 1981; O’Regan & Jacobs, 1992). For these languages, the OVP is between the beginning and middle letters of words and corresponds closely to the PVL. However, for Hebrew, which, like Arabic, is a Semitic language read from right to left, the OVP and the PVL differ (Deutsch & Rayner, 1999). Specifically, the OVP is at the word center and the PVL is to the right of center, between a word’s beginning and middle letters. For Arabic, as for Hebrew, the OVP is at word center (Farid & Grainger, 1996; Jordan, Almabruk, McGowan, & Paterson, 2011), but to date no information has been provided about the PVL for Arabic reading. Accordingly, the present research aimed to shed light on the PVL for Arabic and on how parafoveal cues to word length are used to target saccades during Arabic reading.

However, Arabic text may pose specific difficulties for parafoveal processing that affect both saccade-targeting and the preprocessing of cues to the identities of upcoming words. The first difficulty is that, because Arabic text is cursive, the lack of spatial segregation for many letters in words may decrease their distinctiveness and introduce effects of visual crowding that impede word identification (e.g., Jordan, Paterson, & Almabruk, 2010; Pelli et al., 2007). Moreover, Arabic conventionally is printed in a proportional font in which letter widths vary, and substantial variation in letter size and shape, and variability in the spacing of word segments, may obscure cues to word length and boundaries between words that are used in Latinate languages to plan saccades and help establish word identities (see Schotter, Angele, & Rayner, 2012). Indeed, problems with the visual appearance of text may be exacerbated by the linguistic complexity of Arabic words, as letters that convey a word’s core meaning are dispersed throughout the word at variable locations and interposed between other letters. The second difficulty is that, because Arabic is read from right to left, words that are the targets of forward-moving saccades naturally fall in the readers’ left visual field. Because of contralateral hemispheric projections from the retina, words in the left visual field away from the point of fixation project to the brain’s right hemisphere (Jordan, Fuggetta, Paterson, Kurtev & Xu, 2011), which (for most individuals) is inferior for language (see Gazzaniga, 2000; Jordan & Paterson, 2009). Consequently, the projection of words to the right hemisphere may produce widespread problems with word recognition (e.g., Almabruk, Paterson, McGowan, & Jordan, 2011) that, combined with the visual characteristics of Arabic, may disrupt parafoveal processing of upcoming words. Indeed, previous research suggests the right hemisphere is particularly poor at identifying Arabic letters (Ibrahim et al., 2002), and this may be especially detrimental to parafoveal processing.

Accordingly, the present study investigated effects of word length on eye movements when reading Arabic text by varying the number of letters in words. Text was displayed in a proportional font in which letter widths varied. However, to provide comparability with previous research in Latinate languages that used fixed-width letters, we selected word stimuli in which the two components of word length (number of letters and spatial extent) were highly correlated (see Method). If word length influences where and for how long fixations occur in Arabic reading, different word lengths should produce different patterns of oculomotor behavior. Moreover, if word length affects reading in a way resembling that previously reported for Latinate languages and for Hebrew, longer words should be fixated more often, and receive more fixations and longer fixation times, than short words. Additionally, the PVL should tend to be between the beginning and middle letters of words (to the right of word center), modulated by word length. In contrast, if the characteristics of Arabic text distinguish this language from those previously investigated, this distinction should be apparent in the effects we observe. In either case, the findings would reveal for the first time the fundamentals of eye-movement control when reading Arabic and provide a novel indication of the importance of word length in reading.

Method

Participants

Twelve fluent native Arabic readers (21–35 years) who were students at the University of Leicester were paid for participating. All had normal visual acuity, determined by a Bailey-Lovie Eye Chart, and were right handed, determined by the Revised Annett Handedness questionnaire (Annett, 1970).

Stimuli

Target words were 50 each of 3-, 5-, and 7-letter Arabic words selected from the Aralex database (Boudelaa & Marslen-Wilson, 2010). Words of each length were closely matched for written frequency (3 letters, mean = 49 counts/million; 5 letters, mean = 46 counts/million; 7 letters, mean = 46 counts/million; F < 1). Text was presented in a proportional font (Modern Standard Arabic), and words were selected for which letter length and spatial width were closely matched. Letters on average subtended .31°, and target words subtended the following visual angles: 3 letters, M = .94°, SE = .03°; 5 letters, M = 1.56°, SE = .02°; 7 letters, M = 2.19°, SE = .02°. The letter length and spatial extent of these words were highly correlated (r = .93, p < .01). Following previous research, one target word of each length was inserted into a neutral sentence frame that was identical up to this word, creating 150 stimulus sentences (see Fig. 1). A cloze procedure using 10 additional Arabic readers confirmed that words of each length could not be guessed on any trial and so were unpredictable in each frame. Sentences were 12–17 words long (M = 15) and presented as a single line of text.
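As a rough check on how closely letter count and spatial extent were matched, the reported word widths can be compared with the width expected from the mean letter width alone. The sketch below uses only the values reported above; the function and constant names are illustrative and not part of the original method.

```python
# Values reported in the Stimuli section (degrees of visual angle).
MEAN_LETTER_WIDTH_DEG = 0.31
REPORTED_WORD_WIDTH_DEG = {3: 0.94, 5: 1.56, 7: 2.19}


def expected_width_deg(n_letters, letter_width_deg=MEAN_LETTER_WIDTH_DEG):
    """Width a word would subtend if every letter had the mean letter width."""
    return n_letters * letter_width_deg


for n_letters, reported in REPORTED_WORD_WIDTH_DEG.items():
    expected = expected_width_deg(n_letters)
    difference = reported - expected
    print(f"{n_letters} letters: expected {expected:.2f} deg, "
          f"reported {reported:.2f} deg, difference {difference:+.2f} deg")
```

The small differences are in line with the statement that letter length and spatial extent were closely matched for the selected words.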

Fig. 1 Examples of sentences used in the experiment containing (a) 3-letter, (b) 5-letter and (c) 7-letter target words. Target words are indicated by an arrow (which was not shown in the experiment). These sentences translate into English, with the target word underlined, as follows: (a) “During the meeting the field of improving productivity and self-reliance was discussed”; (b) “During the meeting the memorandum by the secretariat on the implementation of the plan was discussed”; (c) “During the meeting the legality of using human subjects in scientific research was discussed.”

Each participant viewed all 150 sentences. Sentences were shown in three blocks, each containing one occurrence of each sentence frame, and each block contained an equal number of target words of each length. Sentences in each block were presented in a different randomized order for each participant, preceded by 10 additional Arabic sentences which served as practice items.

Apparatus and procedure

An EyeLink 1000 eye tracker recorded right-eye gaze location every millisecond. Sentences were displayed on a ViewSonic monitor as black text on a white background. Participants were instructed to read normally and for comprehension. The eye tracker was then calibrated. At the start of each trial, a fixation square was presented at the right side of the screen. Once this was fixated, a sentence was presented (displayed from right to left), with its first letter replacing the square. Participants pressed a response key on finishing reading each sentence. On 50% of trials (across the three blocks), the sentence was replaced by a comprehension question (in Arabic), to which participants responded. Calibration was checked between trials and the tracker recalibrated as necessary. Each experiment session lasted approximately 40 minutes.

Results

Prior to analyzing data, a standard procedure removed fixations under 80 ms and over 1200 ms (see Footnote 1). Trials in which a blink was made during first-pass reading of target words were discarded (1.8% of trials). All participants achieved 92% or higher accuracy for comprehension questions, indicating they comprehended sentences well.
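The data-cleaning step described above can be sketched as follows. This is a minimal illustration, assuming fixations are available as durations in milliseconds and that each trial record carries a flag marking a blink during first-pass reading of the target word; the record layout and the names `clean_fixations`, `drop_blink_trials`, and `blink_on_target` are hypothetical.

```python
MIN_FIXATION_MS = 80    # fixations shorter than this are removed
MAX_FIXATION_MS = 1200  # fixations longer than this are removed


def clean_fixations(durations_ms, lower=MIN_FIXATION_MS, upper=MAX_FIXATION_MS):
    """Keep only fixations whose duration lies within [lower, upper] ms."""
    return [d for d in durations_ms if lower <= d <= upper]


def drop_blink_trials(trials):
    """Discard trials flagged with a blink during first-pass reading of the target.

    Each trial is assumed to be a dict with a boolean 'blink_on_target' key
    (a hypothetical layout chosen for this sketch).
    """
    return [t for t in trials if not t["blink_on_target"]]


# Hypothetical example data, not taken from the study.
example_trials = [
    {"id": 1, "blink_on_target": False, "fixations_ms": [62, 214, 310, 1450]},
    {"id": 2, "blink_on_target": True, "fixations_ms": [190, 240]},
]
for trial in drop_blink_trials(example_trials):
    print(trial["id"], clean_fixations(trial["fixations_ms"]))
```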

The focus of this study was target word-level eye movements, but sentence-level measures were also computed to provide normative data for Arabic reading. For target word-level measures, a one-way within-participants analysis of variance (ANOVA) with factor word length (3, 5, or 7 letters) was performed, computing error variance across participants (F1) and sentences (F2). Following each analysis, post hoc comparisons (Bonferroni-corrected t tests) examined effects more closely (see Footnote 2).
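For readers unfamiliar with by-participants (F1) and by-sentences (F2) analyses, the following is a minimal sketch of a one-way repeated-measures ANOVA in plain Python. It is not the authors' analysis code. It assumes the data have already been aggregated into one mean per unit (participant for F1, sentence frame for F2) and word length condition, and the function name and toy data are hypothetical.

```python
def rm_anova_oneway(scores):
    """One-way repeated-measures ANOVA.

    scores: list of rows, one row per unit (participant or sentence frame),
            each row holding that unit's mean for every condition.
    Returns (F, df_effect, df_error, partial_eta_squared).
    """
    n = len(scores)     # number of units
    k = len(scores[0])  # number of conditions (e.g., 3 word lengths)
    grand = sum(sum(row) for row in scores) / (n * k)
    cond_means = [sum(row[j] for row in scores) / n for j in range(k)]
    unit_means = [sum(row) / k for row in scores]

    # Partition the total sum of squares into effect, unit, and error terms.
    ss_effect = n * sum((m - grand) ** 2 for m in cond_means)
    ss_units = k * sum((m - grand) ** 2 for m in unit_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_effect - ss_units

    df_effect = k - 1
    df_error = (k - 1) * (n - 1)
    f_value = (ss_effect / df_effect) / (ss_error / df_error)
    partial_eta_sq = ss_effect / (ss_effect + ss_error)
    return f_value, df_effect, df_error, partial_eta_sq


# Toy data: 3 units x 3 conditions (hypothetical values, not from the study).
toy_scores = [[1, 2, 3], [2, 3, 5], [3, 5, 6]]
f_value, df_effect, df_error, partial_eta_sq = rm_anova_oneway(toy_scores)
print(f"F({df_effect}, {df_error}) = {f_value:.2f}, partial eta squared = {partial_eta_sq:.2f}")
```

With 12 participants and 3 word lengths, this partition gives degrees of freedom of (2, 22), matching the F1 values reported in the Results.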

Sentence-level measures

Table 1 shows mean performance for each sentence-level measure. These eye movement behaviors were similar to those for Arabic text in previous research (Jordan et al., 2014) and so appear typical for Arabic reading.

Table 1 Mean performance for sentence-level measures

Target-word-level measures

Fixation and refixation probabilities

Table 2 shows fixation and refixation probabilities for target words. While fixation probabilities were generally high, word length affected the probability of fixating target words, F1(2, 22) = 13.32, p < .001, ηp² = .55, F2(2, 94) = 16.15, p < .001, ηp² = .27, which was higher, to a similar degree, for 7- and 5-letter words than for 3-letter words (ts > 3, ps < .01). Word length also affected refixation probabilities, F1(2, 22) = 25.23, p < .05, ηp² = .70, F2(2, 94) = 46.98, p < .01, ηp² = .50, which were highest for 7-letter words, lower for 5-letter words, and lowest for 3-letter words (ts > 4, ps < .05). Thus, readers generally were more likely to fixate and refixate longer words.

Table 2 Target-word level fixation probabilities and durations

Fixation times

Table 2 shows fixation times for target words. Word length did not affect the duration of first (or single) fixations, Fs < 1.7, but affected gaze durations, F1(2, 22) = 11.97, p < .001, ηp² = .52, F2(2, 94) = 15.56, p < .01, ηp² = .25, and total reading times, F1(2, 22) = 7.53, p < .001, ηp² = .41, F2(2, 94) = 26.00, p < .01, ηp² = .36. Both were longer for 7-letter words than for 3- or 5-letter words (ts > 3, ps < .05), which did not differ (ts < 2, ps > .30). Thus, readers generally fixated longer words for longer.

Effects of word length on landing positions and refixation probabilities

Table 3 shows mean landing positions on target words and the length of the preceding saccade and saccade launch site, and Fig. 2a shows the distribution of landing positions (see Footnote 3). These distributions were approximately normal, although word length affected mean landing positions, F1(2, 22) = 14.09, p < .001, ηp² = .56, F2(2, 94) = 23.66, p < .01, ηp² = .34, which were at the center of 3-letter words but to the right of this location for 7-letter words (ts > 4, ps < .01). Mean landing positions for 5-letter words were between these locations and not significantly different from either. Thus, readers generally fixated the center of short words and fixated further to the right for longer words.

Table 3 Mean word-level landing positions, saccade lengths, and saccade launch sites

Fig. 2 Mean (a) fixation probability, (b) first-fixation duration and (c) refixation probability as a function of landing position in words.

Word length also influenced saccade length, F1(2, 22) = 29.38, p < .001, ηp² = .73, F2(2, 94) = 56.46, p < .001, ηp² = .55, which was shortest for 3-letter, longer for 5-letter, and longest for 7-letter words (ts > 2.8, ps < .01). The mean distance between saccade launch site and the right boundary of target words did not differ across word lengths, Fs < 2.2. This confirmed that variation in landing position across word lengths was driven primarily by differences in saccade length, not launch site, indicating the oculomotor system used parafoveal cues to word length to target progressive saccades. Figures 2b and 2c show the influence of landing position on first-fixation durations and refixation probabilities. These were analyzed by pooling data at beginning, center, and end locations in words (3-letter: beginning = characters 0 & 1, center = character 2, end = character 3; 5-letter: beginning = characters 0 & 1, center = characters 2 & 3, end = characters 4 & 5; 7-letter: beginning = characters 0, 1, & 2, center = characters 3 & 4, end = characters 5, 6, & 7). F1 analyses (data were distributed too sparsely for F2 analyses) showed fixations were shorter and refixation probabilities greater for landing positions at beginning than at either center or end locations, first-fixation duration, F1(2, 22) = 12.56, p < .001, ηp² = .53; refixation probability, F1(2, 22) = 51.78, p < .001, ηp² = .83, with no significant interaction with word length (Fs < 2.4). The effects were broadly similar to those reported for the beginning, center, or end of words in Latinate languages, but with shorter fixations and lower refixation probabilities for landing positions at end word locations.
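The pooling of landing positions into beginning, center, and end regions can be illustrated with a short sketch. The character-to-region mapping is taken directly from the text above. The data layout and function names are hypothetical, and the example values are invented for illustration only.

```python
from collections import defaultdict

# Character positions pooled into regions, as listed in the text.
REGIONS = {
    3: {"beginning": (0, 1), "center": (2,), "end": (3,)},
    5: {"beginning": (0, 1), "center": (2, 3), "end": (4, 5)},
    7: {"beginning": (0, 1, 2), "center": (3, 4), "end": (5, 6, 7)},
}


def landing_region(word_length, character):
    """Return 'beginning', 'center', or 'end' for a landing position."""
    for region, characters in REGIONS[word_length].items():
        if character in characters:
            return region
    raise ValueError(f"character {character} is outside a {word_length}-letter word")


def pooled_means(fixations):
    """Mean first-fixation duration per region.

    fixations: iterable of (word_length, character, first_fixation_ms) tuples
               (a hypothetical layout chosen for this sketch).
    """
    durations = defaultdict(list)
    for word_length, character, duration_ms in fixations:
        durations[landing_region(word_length, character)].append(duration_ms)
    return {region: sum(values) / len(values) for region, values in durations.items()}


# Invented example values, not data from the study.
example = [(3, 1, 200), (5, 0, 220), (7, 4, 260), (5, 3, 240), (7, 6, 250)]
print(pooled_means(example))
```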

Figure 3 shows the launch site of saccades onto target words. Saccades from close launch sites tended to overshoot the center of 3- and 5-letter words and land nearer the center of 7-letter words. Saccades from distant launch sites tended to undershoot the center of words, and saccades from intermediate launch sites tended to land near word centers. These findings were consistent with the range effect and previous findings from Latinate languages (McConkie et al., 1988).

Fig. 3 Mean fixation probability as a function of landing position in words for (a) close, (b) intermediate and (c) distant saccade launch sites.

Discussion

The present findings provide the first evidence of effects of word length on eye movement control when reading Arabic text. In particular, longer words produced higher fixation and refixation probabilities and were fixated for longer than short words. Fixation rates were high, but eye movement behavior appeared typical for Arabic reading (Jordan et al., 2014). The indication, therefore, is that Arabic readers naturally fixate the vast majority of words in text, and this may reflect the visual and linguistic complexity of written Arabic (Ibrahim et al., 2002). The influence of word length on fixation probabilities suggests, nevertheless, that Arabic readers use parafoveal cues to word length to select which upcoming words are fixated. Thus, while the characteristics of Arabic text are very different from those of the Latinate languages that have dominated previous research, the influence of word length on whether a word is fixated is similar in Arabic to that found previously for these languages.

But while in Latinate languages word length is the strongest predictor of fixation probability (see Brysbaert, Drieghe, & Vitu, 2005), lexical frequency and word predictability also are important (Rayner et al., 1996, 2011). Indeed, current computational models allow for words that are short (or common) to be identified in parafoveal vision and skipped during reading (Engbert et al., 2005; Reichle et al., 2003). The extent to which Arabic readers use this additional information is unclear, although the visual characteristics of Arabic words and reduced acuity away from fixation may militate against the use of identity information from parafoveal locations. In particular, cursive script, reduced acuity, and the projection of upcoming words outside foveal vision to the right hemisphere are likely to produce more limited parafoveal processing for Arabic than for Latinate languages. Effects of word length on refixation probabilities and fixation times in the present experiment also reveal that, once fixated, longer words were more difficult to process. This effect on fixation times emerged in gaze durations and so likely resulted from refixations on words (Juhasz & Rayner, 2003). This, too, may be a consequence of the visual and linguistic nature and reading direction of Arabic text, which may encourage readers to engage in greater fixational processing of words to compensate for reduced parafoveal information.

Effects of word length on the landing positions on words contrasted sharply with those observed previously for Latinate languages. The PVL in Latinate languages generally is to the left of word center, but fixations often undershoot this location in long words and overshoot it in short words (Joseph et al., 2009; McConkie et al., 1988; O’Regan, 1981; Paterson et al., 2013; Rayner, 1979). However, the present study shows that landing positions in Arabic, which is read from right to left, were at the center of 3-letter words and to the right of this location for 7-letter words (and between these locations for 5-letter words). This pattern resonates with that obtained previously for Hebrew (Deutsch & Rayner, 1999) which, like Arabic, is read from right to left. Consequently, the present findings provide a further indication that the direction of reading determines the PVL. Indeed, following the account presented by McConkie et al. (1988) for Latinate languages, the PVL for Arabic may be due to saccades targeted towards the center of upcoming words, undershooting this intended location in long words and overshooting it in short words, due to error and the range effect. Effects of launch site on landing positions on words in the present experiment were consistent with the range effect. Moreover, consistent with this general account, the length of saccades landing on the target word varied as a function of target word length (not launch site), indicating that parafoveal cues to word length were used to target saccades towards specific locations in words. Finally, first-fixation times were shorter and refixation probabilities higher for landing positions at beginning rather than middle locations in words, consistent with effects in Latinate languages (Rayner et al., 1996; Vitu et al., 2001), but differed little for landing positions at middle and end locations. This may reflect specific patterns of mislocated fixations (Nuthmann et al., 2005) in Arabic reading, or the benefit of fixating a broad range of locations in words (Vitu et al., 2001), possibly due to the distribution of letters that convey core meaning throughout words in Arabic. What is clear, however, is that despite problems created by the visual appearance and spatial segregation of letters and words, readers of Arabic use parafoveal information about the length and location of words to target saccades, although the precise nature of this information now remains to be determined.

In sum, despite the global use of Arabic in human societies, the influence of fundamental components of Arabic text on where and for how long readers fixate has not previously been reported. The present study reveals for the first time that oculomotor behavior when reading Arabic is guided by the length of words encountered in text, and that these influences impact on which words are fixated, where they are fixated, and for how long. The indication, therefore, is that effects of word length are a widespread and fundamental component of reading and play a central role in guiding eye-movement behavior across a range of very different alphabetic systems.