On the Production and the Perception of Audio-Visual Speech by Man and Machine

  • C. Benoît


Since the 1950s, several experiments have evaluated the “benefit of lip-reading” for speech intelligibility, each presenting a natural face speaking at different levels of background noise (Sumby and Pollack, 1954; Neely, 1956; Erber, 1969; Binnie et al., 1974; Erber, 1975). Here we present a similar experiment run with French stimuli.


Keywords: Face Model; Recognition Score; Visible Speech; Speech Intelligibility; Natural Face
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




  1. Adjoudani, A. and Benoît, C., to appear, On the integration of auditory and visual parameters in an HMM-based ASR, in: Speechreading by Man and Machine, D. Stork, Ed., NATO-ASI series, Springer-Verlag (1996).
  2. Benoît, C., Boë, L.J., and Abry, C., 1991, The effect of context on labiality in French, Proceedings of the 2nd Eurospeech Conference, Vol. 1, 153–156, Genoa, Italy.
  3. Benoît, C., Lallouache, M.T., Mohamadi, T.M., and Abry, C., 1992, A set of French visemes for visual speech synthesis, in: Talking Machines: Theories, Models, and Designs, G. Bailly & C. Benoît, Eds, Elsevier Science Publishers, North-Holland, Amsterdam, 485–503.
  4. Benoît, C., Mohamadi, T., and Kandel, S., 1994, Effect of phonetic context on audio-visual intelligibility of French, Journal of Speech and Hearing Research, 37, 1195–1203.
  5. Binnie, C.A., Montgomery, A.A., and Jackson, P.L., 1974, Auditory and visual contributions to the perception of consonants, Journal of Speech and Hearing Research, 17, 619–630.
  6. Cohen, M.M. & Massaro, D.W., 1993, Modeling coarticulation in synthetic visual speech, Computer Animation’93, N. Magnenat-Thalmann & D. Thalmann, Eds, Springer-Verlag.
  7. Erber, N.P., 1969, Interaction of audition and vision in the recognition of oral speech stimuli, Journal of Speech & Hearing Research, 12, 423–425.
  8. Erber, N.P., 1975, Auditory-visual perception of speech, Journal of Speech & Hearing Disorders, 40, 481–492.
  9. Guiard-Marigny, T. and Ostry, D.J., 1995, Three-dimensional visualization of human jaw motion in speech, Meeting of the Acoustical Society of America, Washington.
  10. Guiard-Marigny, T., Benoît, C., and Ostry, D.J., 1995, Speech intelligibility of synthetic lips and jaw, Proc. of the 13th Int. Congress of Phonetic Sciences, Vol. 3, 222–226, Stockholm, Sweden.
  11. Le Goff, B., Guiard-Marigny, T., and Benoît, C., 1994, Real-time analysis-synthesis and intelligibility of talking faces, Proc. of the 2nd International Workshop on Speech Synthesis, 53–56, New Paltz (NY), USA.
  12. Le Goff, B., Guiard-Marigny, T., and Benoît, C., 1995, Read my lips… and my jaw! How intelligible are the components of a speaker’s face?, Proceedings of the 4th Eurospeech Conference, Vol. 1, 291–294, Madrid, Spain.
  13. McGrath, M., 1985, An examination of cues for visual and audio-visual speech perception using natural and computer-generated faces, Ph.D. Thesis, University of Nottingham, UK.
  14. Neely, K.K., 1956, Effect of visual factors on the intelligibility of speech, Journal of the Acoustical Society of America, 28, 1275–1277.
  15. Sumby, W.H., & Pollack, I., 1954, Visual contribution to speech intelligibility in noise, Journal of the Acoustical Society of America, 26, 212–215.
  16. Summerfield, Q., MacLeod, A., McGrath, M., & Brooke, M., 1989, Lips, teeth, and the benefit of lipreading, in: Handbook of Research on Face Processing, A.W. Young & H.D. Ellis, Eds, Elsevier Science Publishers, North-Holland, Amsterdam, 223–233.

Copyright information

© Plenum Press, New York 1996

Authors and Affiliations

  • C. Benoît
    • 1
  1. Institut de la Communication Parlée, Unité de Recherche Associée au CNRS N° 368, INPG/ENSERG – Université Stendhal, Grenoble, France
