Psychonomic Bulletin & Review, Volume 15, Issue 6, pp 1064–1071

Tracking the time course of phonetic cue integration during spoken word recognition

  • Bob McMurray
  • Meghan A. Clayards
  • Michael K. Tanenhaus
  • Richard N. Aslin
Brief Reports


Speech perception requires listeners to integrate multiple cues that each contribute to judgments about a phonetic category. Classic studies of trading relations assessed the weights attached to each cue but did not explore the time course of cue integration. Here, we provide the first direct evidence that asynchronous cues to voicing (/b/ vs. /p/) and manner (/b/ vs. /w/) contrasts become available to the listener at different times during spoken word recognition. Using the visual world paradigm, we show that the probabilities of eye movements to pictures of target and competitor objects diverge at different points in time after the onset of the target word. These points of divergence correspond to the availability of early (voice onset time or formant transition slope) and late (vowel length) cues to voicing and manner contrasts. These results support a model of cue integration in which phonetic cues are used for lexical access as soon as they are available.


Keywords: Formant Transition, Speech Perception, Word Pair, Lexical Access, Voice Onset Time





Copyright information

© Psychonomic Society, Inc. 2008

Authors and Affiliations

  • Bob McMurray (3)
  • Meghan A. Clayards (1)
  • Michael K. Tanenhaus (2)
  • Richard N. Aslin (2)
  1. University of York, York, England
  2. University of Rochester, Rochester, New York, USA
  3. Department of Psychology, E11 SSH, University of Iowa, Iowa City
