Usage of statistical cues for word boundary in reading Chinese sentences

Reading and Writing

Abstract

The present study examined the use of statistical cues for word boundaries during Chinese reading. Participants read sentences for comprehension while their eye movements were recorded. A two-character target word was embedded in each sentence. The manipulation concerned the contrast between the probabilities of the ending character (C2) of the target word (C12) being used as a word beginning versus a word ending across all words containing it. In addition, using the boundary paradigm, parafoveal overlapping ambiguity in the string C123 was manipulated with three types of preview of the character C3, which was a single-character word in the identical condition. During preview, the combination C23′ was a legal word in the ambiguous condition and was not a word in the control condition. Significant probability and preview effects were observed. In the low-probability condition, the inconsistency between the character's frequent within-word position (word beginning) and its present position (word ending) lengthened gaze durations and increased the refixation rate on the target word. Although benefits from the identical previews were apparent, effects of overlapping ambiguity were negligible. The results suggest that the probability of within-word positions influenced character-to-word assignment, which was mainly verified during foveal processing. Thus, the overlapping ambiguity between parafoveal words did not interfere with reading. Further investigation is necessary to examine whether current computational models of eye movement control should incorporate statistical cues for word boundaries, together with other linguistic factors, in their word processing systems to account for Chinese reading.
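The positional-probability manipulation described above can be illustrated with a minimal sketch. It assumes that a character's positional probability is the share of the multi-character words containing it in which it occupies the first or the last position. The function name `positional_probabilities`, the toy lexicon, and its counts are hypothetical and are not taken from the study's materials or corpus.

```python
from collections import Counter


def positional_probabilities(char, lexicon):
    """Return (p_begin, p_end) for `char` over the words in `lexicon` containing it.

    `lexicon` maps each word to a count (use 1 for type-based counting).
    Assumption: single-character words are skipped because they have no
    within-word position; the study's exact counting rule is not given here.
    """
    counts = Counter()
    for word, freq in lexicon.items():
        if len(word) < 2 or char not in word:
            continue
        if word[0] == char:
            counts["begin"] += freq
        if word[-1] == char:
            counts["end"] += freq
        counts["total"] += freq
    if counts["total"] == 0:
        return 0.0, 0.0
    return counts["begin"] / counts["total"], counts["end"] / counts["total"]


# Toy lexicon with made-up counts (illustration only).
toy_lexicon = {"現在": 1, "現代": 1, "現實": 1, "出現": 1, "表現": 1}

p_begin, p_end = positional_probabilities("現", toy_lexicon)
# A positive contrast means the character is more often a word beginning.
contrast = p_begin - p_end
print(p_begin, p_end, contrast)
```

In this toy lexicon the character begins three of the five words that contain it and ends two, so a target word ending in it would fall on the "inconsistent" side of the contrast. Token-based counting can be approximated by replacing the counts of 1 with corpus frequencies.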

Notes

  1. Characters that are more frequently used as word beginnings tend to form verbs when combined with other characters, regardless of within-word position. In contrast, characters that are more frequently used as word endings tend to form nouns. In the present study, C12 and C23′ were chosen from the same word class (noun or verb). Thus, a large proportion of target words in the low-probability condition were verbs, whereas only a small proportion of target words in the high-probability condition were verbs. To our knowledge, differences between nouns and verbs during the reading of Chinese sentences have not been documented in the literature. We restricted the analysis to 23 frequency-matched verbs in each probability condition and found results similar to those of the main analysis. Critically, gaze durations in the area C123 and on the target C12 were significantly longer in the low-probability condition than in the high-probability condition [C123: 25 ms; F1(1, 26) = 4.96, MSE = 5187, p < .05; F2(1, 44) = 1.69, MSE = 14587, p > .20; C12: 42 ms; F1(1, 26) = 23.43, MSE = 2983, p < .001; F2(1, 44) = 7.48, MSE = 10080, p < .01]. There was no difference in FFD (Fs < 1.32, ps > .26) between probability conditions. However, the refixation rate on the target C12 was significantly higher in the low-probability condition than in the high-probability condition [F1(1, 26) = 13.09, MSE = 240, p < .01; F2(1, 44) = 7.95, MSE = 468, p < .01], whereas the difference was not significant in the area C123 (Fs < 1.31, ps > .26). The results of the analysis restricted to nouns are not reliable because only 8 frequency-matched nouns could be found. In addition, we analyzed gaze durations on all two-character words in the sentences of the present study in a supplementary linear mixed-effects analysis (Baayen, 2008). Effects of both word class and positional probability were observed. Gaze durations on verbs were longer than those on nouns (t = 4.89). Moreover, the higher the probability of a character being used as a word beginning, the longer the gaze durations (t = 2.06), and the effect of positional probability was larger for verbs than for nouns (t = 1.89). Further studies are necessary for a deeper understanding of the effect of positional probability on word recognition.

  2. C3 in the identical condition was a single-character word, which is usually highly frequent and visually simple (in terms of number of strokes). Thus, C3 in the identical condition and those in the non-identical conditions were different in number of strokes and character frequency (ps < .001). However, the critical comparison in the present study was between the two non-identical conditions, in which C3 was carefully matched (ps > .99).

  3. As shown in Table 2, the numbers of strokes of C1 and C2 did not differ between the low- and high-probability conditions (ps > .58). Character frequencies of C1 did not differ between conditions (p > .96). However, C2s in the low-probability condition (M = 198.7 per 1 million, SD = 239.8, range 16.0–1222.1) were less frequent than those in the high-probability condition (M = 343.6 per 1 million, SD = 366.7, range 24.2–1589.0), p < .01. Although they differed in frequency of occurrence in the corpus of written characters, they were all subjectively familiar characters (5.3 and 5.3 for the low- and high-probability conditions, respectively, p > .49). The subjective familiarity ratings are based on an unpublished corpus of 5640 Chinese characters; the data were collected from 160 college students using a 7-point familiarity scale (Lee, Tsai, Chan, Hsu, Hung, & Tzeng, 2007, p. 148). The subjective familiarity ratings of all characters in the critical C123 area were higher than 3.5. In addition, when the item analysis was restricted to targets with C2 frequencies lower than 100 per 1 million (33 and 22 items in the low- and high-probability conditions, respectively), similar patterns were obtained for FFD, GD, and refixation rate: the ROI was fixated longer and was more likely to be refixated in the low- than in the high-probability condition, although none of these differences reached significance (C123: FFD, 8 ms, p > .32, GD, 30 ms, p > .17, refixation rate, 4.1%, p > .28; C12: FFD, 8 ms, p > .34, GD, 27 ms, p > .14, refixation rate, 4.9%, p > .17). Nevertheless, the effect of positional probability should still be interpreted with caution.

  4. It would be informative to analyze the data separately for the target C12 and the preview C3 to examine the time course of the effects of positional probability and preview type (we thank one of the reviewers for this suggestion). However, C3 was a single-character word with a high skipping rate, which left too few data points for reliable statistics. Thus, C123 was chosen as the ROI because it provided sufficient data points for analysis.

References

  • Academia Sinica Taiwan. (1998). Academia Sinica balanced corpus (Version 3) [CD-ROM]. Taipei, Taiwan: Academia Sinica, Chinese Knowledge and Information Processing Group.

  • Andrews, S. (1997). The effect of orthographic similarity on lexical retrieval: Resolving neighborhood conflicts. Psychonomic Bulletin & Review, 4, 439–461.

  • Baayen, R. H. (2008). Analyzing linguistic data: A practical introduction to statistics using R. Cambridge, UK: Cambridge University Press.

  • Bertram, R., Baayen, R. H., & Schreuder, R. (2000). Effects of family size for complex words. Journal of Memory and Language, 42, 390–405.

  • Bertram, R., Pollatsek, A., & Hyönä, J. (2004). Morphological parsing and the use of segmentation cues in reading Finnish compounds. Journal of Memory and Language, 51, 325–345.

  • Engbert, R., Nuthmann, A., Richter, E. M., & Kliegl, R. (2005). SWIFT: A dynamical model of saccade generation during reading. Psychological Review, 112, 777–813.

  • Findlay, J. M., & Gilchrist, I. D. (2003). Active vision: The psychology of looking and seeing. Oxford: Oxford University Press.

  • Inhoff, A. W. (1984). Two stages of word processing during eye fixations in the reading of prose. Journal of Verbal Learning and Verbal Behavior, 23, 612–624.

  • Inhoff, A. W., & Liu, W. (1998). The perceptual span and oculomotor activity during the reading of Chinese sentences. Journal of Experimental Psychology: Human Perception and Performance, 24, 20–34.

  • Inhoff, A. W., & Wu, C. (2005). Eye movements and the identification of spatially ambiguous words during Chinese sentence reading. Memory & Cognition, 33, 1345–1356.

  • Lee, C.-Y., Tsai, J.-L., Chan, W.-H., Hsu, C.-H., Hung, D. L., & Tzeng, O. J.-L. (2007). Temporal dynamics of the consistency effect in reading Chinese: An event-related potentials study. NeuroReport, 18, 147–151.

  • Li, X., Rayner, K., & Cave, K. R. (2009). On the segmentation of Chinese words during reading. Cognitive Psychology, 58, 525–552.

  • Liu, W., Inhoff, A. W., Ye, Y., & Wu, C. (2002). Use of parafoveally visible characters during the reading of Chinese sentences. Journal of Experimental Psychology: Human Perception and Performance, 28, 1213–1227.

  • McConkie, G. W., Kerr, P. W., Reddix, M. D., & Zola, D. (1988). Eye movement control during reading: I. The location of initial eye fixations on words. Vision Research, 28, 1107–1118.

  • McConkie, G. W., & Rayner, K. (1975). The span of the effective stimulus during a fixation in reading. Perception & Psychophysics, 17, 578–586.

  • McDonald, S. A., & Shillcock, R. C. (2003). Low-level predictive inference in reading: The influence of transitional probabilities on eye movements. Vision Research, 43, 1735–1751.

  • Pollatsek, A., Reichle, E. D., & Rayner, K. (2006). Tests of the E-Z Reader model: Exploring the interface between cognition and eye-movement control. Cognitive Psychology, 52, 1–56.

  • Radach, R., Reilly, R., & Inhoff, A. W. (2007). Models of oculomotor control in reading: Towards a theoretical foundation of current debates. In R. P. G. van Gompel, M. H. Fischer, W. S. Murray, & R. L. Hill (Eds.), Eye movements: A window on mind and brain (pp. 237–269). Oxford: Elsevier.

  • Rayner, K. (1975). The perceptual span and peripheral cues in reading. Cognitive Psychology, 7, 65–81.

  • Rayner, K. (1979). Eye guidance in reading: Fixation locations within words. Perception, 8, 21–30.

  • Rayner, K. (1998). Eye movements in reading and information processing: 20 years of research. Psychological Bulletin, 124, 372–422.

  • Rayner, K., & Duffy, S. A. (1986). Lexical complexity and fixation times in reading: Effects of word frequency, verb complexity, and lexical ambiguity. Memory & Cognition, 14, 191–201.

  • Rayner, K., Li, X., Juhasz, B. J., & Yan, G. (2005). The effect of word predictability on the eye movements of Chinese readers. Psychonomic Bulletin & Review, 12, 1089–1093.

  • Rayner, K., Li, X., & Pollatsek, A. (2007). Extending the E-Z Reader model of eye movement control to Chinese readers. Cognitive Science, 31, 1021–1033.

  • Rayner, K., & Well, A. D. (1996). Effects of contextual constraint on eye movements in reading: A further examination. Psychonomic Bulletin & Review, 3, 504–509.

  • Rayner, K., Well, A. D., & Pollatsek, A. (1980). Asymmetry of the effective visual field in reading. Perception & Psychophysics, 27, 537–544.

  • Rayner, K., Well, A. D., Pollatsek, A., & Bertera, J. H. (1982). The availability of useful information to the right of fixation in reading. Perception & Psychophysics, 31, 537–550.

  • Reilly, R. G., & Radach, R. (2006). Some empirical tests of an interactive activation model of eye movement control in reading. Cognitive Systems Research, 7, 34–55.

  • Reilly, R. G., Radach, R., Luksaneeyanawin, S., & Aranyanak, I. (2009, August). Factors involved in eye guidance in reading Thai. Paper presented at the 15th European Conference on Eye Movements, Southampton, UK.

  • Schilling, H. E. H., Rayner, K., & Chumbley, J. I. (1998). Comparing naming, lexical decision, and eye fixation times: Word frequency effects and individual differences. Memory & Cognition, 26, 1270–1281.

  • Tsai, J.-L. (2001). A multichannel PC tachistoscope with high resolution and fast display change capability. Behavior Research Methods, Instruments, & Computers, 33, 524–531.

  • Tsai, J.-L., Lee, C.-Y., Lin, Y.-C., Tzeng, O. J.-L., & Hung, D. L. (2006). Neighborhood size effects of Chinese words in lexical decision and reading. Language and Linguistics, 7, 659–675.

  • Tsai, J.-L., Lee, C.-Y., Tzeng, O. J.-L., Hung, D. L., & Yen, N.-S. (2004). Use of phonological codes for Chinese characters: Evidence from processing of parafoveal preview when reading sentences. Brain and Language, 91, 235–244.

  • Tsai, J.-L., & McConkie, G. W. (2003). Where do Chinese readers send their eyes? In J. Hyönä, R. Radach, & H. Deubel (Eds.), The mind’s eye: Cognitive and applied aspects of eye movement research (pp. 159–176). Amsterdam, The Netherlands: Elsevier.

  • Underwood, N. R., & McConkie, G. W. (1985). Perceptual span for letter distinctions during reading. Reading Research Quarterly, 20, 153–162.

  • Yan, M., Kliegl, R., Richter, E. M., Nuthmann, A., & Shu, H. (2010). Flexible saccade-target selection in Chinese reading. Quarterly Journal of Experimental Psychology, 63, 705–725.

  • Yan, M., Richter, E. M., Shu, H., & Kliegl, R. (2009). Readers of Chinese extract semantic information from parafoveal words. Psychonomic Bulletin & Review, 16, 561–566.

  • Yan, G., Tian, H., Bai, X., & Rayner, K. (2006). The effect of word and character frequency on the eye movements of Chinese readers. British Journal of Psychology, 97, 259–268.

  • Yang, H.-M., & McConkie, G. W. (1999). Reading Chinese: Some basic eye-movement characteristics. In J. Wang, A. W. Inhoff, & H.-C. Chen (Eds.), Reading Chinese script: A cognitive analysis (pp. 207–222). Mahwah, NJ: Lawrence Erlbaum Associates.

  • Yang, J., Wang, S., Xu, Y., & Rayner, K. (2009). Do Chinese readers obtain preview benefit from word n + 2? Evidence from eye movements. Journal of Experimental Psychology: Human Perception and Performance, 35, 1192–1204.

  • Yen, M.-H., Radach, R., Tzeng, O. J.-L., Hung, D. L., & Tsai, J.-L. (2009). Early parafoveal processing in reading Chinese sentences. Acta Psychologica, 131, 24–33.

  • Yen, M.-H., Tsai, J.-L., Tzeng, O. J.-L., & Hung, D. L. (2008). Eye movements and parafoveal word processing in reading Chinese. Memory & Cognition, 36, 1033–1045.

Acknowledgments

This study was supported by grants from Taiwan’s National Science Council (NSC96-2413-H-004-018-MY3, NSC 99-2420-H-004-002-, NSC98-2811-H-004-022, NSC 99-2811-H-004-015, and NSC 100-2410-H-003-001-). The final version of this manuscript was completed when the first author moved to the Graduate Institute of Science Education, National Taiwan Normal University. We thank two anonymous reviewers for helpful comments on earlier versions of the manuscript.

Corresponding author

Correspondence to Jie-Li Tsai.

Cite this article

Yen, MH., Radach, R., Tzeng, O.JL. et al. Usage of statistical cues for word boundary in reading Chinese sentences. Read Writ 25, 1007–1029 (2012). https://doi.org/10.1007/s11145-011-9321-z