International Journal of Speech Technology, Volume 3, Issue 1, pp 15–25

The Effect of Lexical Complexity on Intelligibility

  • Alexander L. Francis
  • Howard C. Nusbaum
Article

Abstract

Most intelligibility tests are based on the use of monosyllabic test stimuli. This constraint eliminates the ability to measure the effects of lexical stress patterns, complex phonotactic organizations, and morphological complexity on intelligibility. Since these aspects of lexical structure affect speech production (e.g., by changing syllable duration), it is likely that they affect the structure of acoustic-phonetic patterns. Thus, to the extent that text-to-speech systems fail to modify acoustic-phonetic patterns appropriately in polysyllabic words, intelligibility may suffer. This means that while most standard intelligibility tests may accurately estimate the intelligibility of monosyllabic words, this estimate may not generalize as well to predict the intelligibility of words with more complex lexical structures. The present study was carried out to measure how words varying in lexical complexity differ in intelligibility. Monosyllabic, bisyllabic, and polysyllabic words were used, varying in morphological complexity (monomorphemic or polymorphemic). Listeners transcribed these stimuli spoken by two human talkers and two text-to-speech systems varying in speech quality. The results indicate that lexical complexity does affect the measured intelligibility of synthetic speech and should be manipulated in order to accurately predict the performance of text-to-speech systems with unrestricted natural text.

Keywords: text-to-speech, intelligibility assessment, lexical complexity

Copyright information

© Kluwer Academic Publishers 1999

Authors and Affiliations

  • Alexander L. Francis (1)
  • Howard C. Nusbaum (1)
  1. Department of Psychology, University of Chicago, Chicago
