Talking points: A modulating circle reduces listening effort without improving speech recognition

  • Julia F. Strand
  • Violet A. Brown
  • Dennis L. Barbour
Brief Report

Abstract

Speech recognition is improved when the acoustic input is accompanied by visual cues provided by a talking face (Erber in Journal of Speech and Hearing Research, 12(2), 423–425, 1969; Sumby & Pollack in The Journal of the Acoustical Society of America, 26(2), 212–215, 1954). One way that the visual signal facilitates speech recognition is by providing the listener with information about fine phonetic detail that complements information from the auditory signal. However, given that degraded face stimuli can still improve speech recognition accuracy (Munhall et al. in Perception & Psychophysics, 66(4), 574–583, 2004), and static or moving shapes can improve speech detection accuracy (Bernstein et al. in Speech Communication, 44(1/4), 5–18, 2004), aspects of the visual signal other than fine phonetic detail may also contribute to the perception of speech. In two experiments, we show that a modulating circle providing information about the onset, offset, and acoustic amplitude envelope of the speech does not improve recognition of spoken sentences (Experiment 1) or words (Experiment 2), but does reduce the effort necessary to recognize speech. These results suggest that although fine phonetic detail may be required for the visual signal to benefit speech recognition, low-level features of the visual signal may function to reduce the cognitive effort associated with processing speech.
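The modulating circle described above tracks the acoustic amplitude envelope of the speech. As an illustrative aside, one common way to derive such a control signal (rectify the waveform, then smooth it with a moving average) can be sketched in Python. The function and parameter names here are hypothetical; the article does not specify how its stimulus was implemented:

```python
# Illustrative sketch (not the authors' implementation): derive a radius
# signal for a modulating circle from an audio waveform's amplitude
# envelope, by rectifying the signal and smoothing it with a moving average.

def amplitude_envelope(samples, window=441):
    """Rectified, moving-average-smoothed envelope of an audio signal.

    samples: sequence of floats in [-1.0, 1.0]
    window:  smoothing window in samples (e.g., ~10 ms at 44.1 kHz)
    """
    rectified = [abs(s) for s in samples]
    half = window // 2
    envelope = []
    for i in range(len(rectified)):
        lo, hi = max(0, i - half), min(len(rectified), i + half + 1)
        envelope.append(sum(rectified[lo:hi]) / (hi - lo))
    return envelope

def circle_radii(envelope, r_min=20.0, r_max=100.0):
    """Map envelope values linearly onto circle radii (in pixels)."""
    peak = max(envelope) or 1.0  # guard against an all-silence input
    return [r_min + (e / peak) * (r_max - r_min) for e in envelope]
```

A stimulus like this conveys the onset, offset, and amplitude envelope of the speech while carrying no phonetic detail, which is the contrast the experiments exploit.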

Keywords

Spoken word recognition · Speech perception · Cross-modal attention

Supplementary material

  • ESM 1: 13423_2018_1489_MOESM1_ESM.docx (14 kb)
  • ESM 2: 13423_2018_1489_MOESM2_ESM.mp4 (548 kb)

References

  1. Arnold, P., & Hill, F. (2001). Bisensory augmentation: A speechreading advantage when speech is clearly audible and intact. British Journal of Psychology, 92, 339–355.
  2. Barr, D. J., Levy, R., Scheepers, C., & Tily, H. J. (2013). Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language, 68(3). https://doi.org/10.1016/j.jml.2012.11.001
  3. Bates, D., Maechler, M., Bolker, B., Walker, S., Christensen, R., Singmann, H., … Green, P. (2014). Package lme4 [Computer software]. Vienna, Austria: R Foundation for Statistical Computing. Retrieved from https://github.com/lme4/lme4/
  4. Bentler, R., Wu, Y.-H., Kettel, J., & Hurtig, R. (2008). Digital noise reduction: Outcomes from laboratory and field studies. International Journal of Audiology, 47(8), 447–460.
  5. Bernstein, L. E., Auer, E. T., Jr., & Takayanagi, S. (2004). Auditory speech detection in noise enhanced by lipreading. Speech Communication, 44(1/4), 5–18.
  6. Brysbaert, M., New, B., & Keuleers, E. (2012). Adding part-of-speech information to the SUBTLEX-US word frequencies. Behavior Research Methods, 44(4), 991–997.
  7. Desjardins, J. L., & Doherty, K. A. (2014). The effect of hearing aid noise reduction on listening effort in hearing-impaired adults. Ear and Hearing, 35(6), 600–610.
  8. Downs, D. W. (1982). Effects of hearing aid use on speech discrimination and listening effort. The Journal of Speech and Hearing Disorders, 47(2), 189–193.
  9. Erber, N. P. (1969). Interaction of audition and vision in the recognition of oral speech stimuli. Journal of Speech and Hearing Research, 12(2), 423–425.
  10. Gagné, J.-P., Besser, J., & Lemke, U. (2017). Behavioral assessment of listening effort using a dual-task paradigm: A review. Trends in Hearing, 21. https://doi.org/10.1177/2331216516687287
  11. Gosselin, P. A., & Gagné, J.-P. (2011). Older adults expend more listening effort than young adults recognizing audiovisual speech in noise. International Journal of Audiology, 50(11), 786–792.
  12. Grant, K. W., van Wassenhove, V., & Poeppel, D. (2004). Detection of auditory (cross-spectral) and auditory–visual (cross-modal) synchrony. Speech Communication, 44(1/4), 43–53.
  13. Grant, K. W., & Walden, B. E. (1996). Evaluating the articulation index for auditory-visual consonant recognition. The Journal of the Acoustical Society of America, 100(4), 2415–2424.
  14. Helfer, K. S., & Freyman, R. L. (2005). The role of visual speech cues in reducing energetic and informational masking. The Journal of the Acoustical Society of America, 117, 842–849.
  15. Jordan, T. R., & Sergeant, P. (2000). Effects of distance on visual and audiovisual speech recognition. Language and Speech, 43(1), 107–124.
  16. Kahneman, D. (1973). Attention and effort. Englewood Cliffs, NJ: Prentice Hall.
  17. Kalikow, D. N., Stevens, K. N., & Elliott, L. L. (1977). Development of a test of speech intelligibility in noise using sentence materials with controlled word predictability. The Journal of the Acoustical Society of America, 61(5), 1337–1351.
  18. Kim, J., & Davis, C. (2004). Investigating the audio–visual speech detection advantage. Speech Communication, 44(1/4), 19–30.
  19. Kuznetsova, A., Brockhoff, P., & Christensen, R. (2017). lmerTest package: Tests in linear mixed effects models. Journal of Statistical Software, 82(13), 1–26.
  20. Mishra, S., Lunner, T., Stenfelt, S., Rönnberg, J., & Rudner, M. (2013). Visual information can hinder working memory processing of speech. Journal of Speech, Language, and Hearing Research, 56, 1120–1132.
  21. Munhall, K. G., Kroos, C., Jozan, G., & Vatikiotis-Bateson, E. (2004). Spatial frequency requirements for audiovisual speech perception. Perception & Psychophysics, 66(4), 574–583.
  22. Pichora-Fuller, M. K., Kramer, S. E., Eckert, M. A., Edwards, B., Hornsby, B. W. Y., Humes, L. E., … Wingfield, A. (2016). Hearing impairment and cognitive energy: The Framework for Understanding Effortful Listening (FUEL). Ear and Hearing, 37(Suppl. 1), 5S–27S.
  23. Picou, E. M., & Ricketts, T. A. (2014). The effect of changing the secondary task in dual-task paradigms for measuring listening effort. Ear and Hearing, 35(6), 611–622.
  24. Rabbitt, P. M. (1968). Channel-capacity, intelligibility and immediate memory. The Quarterly Journal of Experimental Psychology, 20(3), 241–248.
  25. Rosenblum, L. D., Johnson, J. A., & Saldaña, H. M. (1996). Point-light facial displays enhance comprehension of speech in noise. Journal of Speech and Hearing Research, 39(6), 1159–1170.
  26. Sarampalis, A., Kalluri, S., Edwards, B., & Hafter, E. (2009). Objective measures of listening effort: Effects of background noise and noise reduction. Journal of Speech, Language, and Hearing Research, 52(5), 1230–1240.
  27. Schwartz, J.-L., Berthommier, F., & Savariaux, C. (2004). Seeing to hear better: Evidence for early audio-visual interactions in speech identification. Cognition, 93(2), B69–B78.
  28. Sommers, M. S., & Phelps, D. (2016). Listening effort in younger and older adults: A comparison of auditory-only and auditory-visual presentations. Ear and Hearing, 37(Suppl. 1), 62S–68S.
  29. Strand, J. F., Brown, V. A., Merchant, M. M., Brown, H. E., & Smith, J. (in press). Measuring listening effort: Convergent validity, sensitivity, and links with cognitive and personality measures. Journal of Speech, Language, and Hearing Research.
  30. Sumby, W. H., & Pollack, I. (1954). Visual contributions to speech intelligibility in noise. The Journal of the Acoustical Society of America, 26(2), 212–215.
  31. Summerfield, Q. (1979). Use of visual information for phonetic perception. Phonetica, 36, 314–331.
  32. Tye-Murray, N., Spehar, B., Myerson, J., Sommers, M. S., & Hale, S. (2011). Cross-modal enhancement of speech detection in young and older adults: Does signal content matter? Ear and Hearing, 32(5), 650–655.
  33. Van Engen, K. J., Chandrasekaran, B., & Smiljanic, R. (2012). Effects of speech clarity on recognition memory for spoken sentences. PLoS ONE, 7(9), e43753.
  34. Van Engen, K. J., Phelps, J. E. B., Smiljanic, R., & Chandrasekaran, B. (2014). Enhancing speech intelligibility: Interactions among context, modality, speech style, and masker. Journal of Speech, Language, and Hearing Research, 57(5), 1908–1918.

Copyright information

© Psychonomic Society, Inc. 2018

Authors and Affiliations

  • Julia F. Strand (1)
  • Violet A. Brown (1)
  • Dennis L. Barbour (2)
  1. Department of Psychology, Carleton College, Northfield, USA
  2. Department of Biomedical Engineering, Washington University in St. Louis, St. Louis, USA