
Animating speech: an automated approach using speech synthesised by rules

Published in The Visual Computer.
Abstract

This paper is concerned with the problem of animating computer-drawn images of speaking human characters, and particularly with reducing the cost of adequate lip synchronisation. Since the method is based on speech synthesis by rules, extended to manipulate facial parameters, and since generalised data about the facial expressions associated with speech must also be gathered, these problems are touched upon as well. Useful parallels can be drawn between the problems of speech synthesis and those of facial expression synthesis. The paper outlines the background to the work, the problems involved, and some approaches to their solution, and goes on to describe work in progress in the authors' laboratories that has resulted in one apparently successful approach to low-cost animated speaking faces. Outstanding problems are noted, the chief ones being: the difficulty of selecting and controlling appropriate facial expression categories; the lack of naturalness of the synthetic speech; and the need to consider the body movements and speech of all characters in an animated sequence during the animation process.
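The core idea of driving facial parameters from a rule-based synthesiser's segment stream can be illustrated with a minimal sketch. This is purely a hypothetical reconstruction, not the authors' implementation: the phoneme names, the `(jaw_open, lip_width)` parameter pair, and the target values in the rule table are all illustrative assumptions.

```python
# Hypothetical sketch: deriving per-frame mouth parameters from a
# rule-style phonetic segment list, in the spirit of extending
# speech synthesis by rules to drive facial parameters.

# Each segment: (phoneme, duration in seconds) -- illustrative values.
SEGMENTS = [("h", 0.06), ("eh", 0.12), ("l", 0.08), ("ou", 0.20)]

# Rule table: target (jaw_open, lip_width) per phoneme, each in [0, 1].
TARGETS = {
    "h":  (0.2, 0.5),
    "eh": (0.5, 0.6),
    "l":  (0.3, 0.5),
    "ou": (0.4, 0.2),   # rounded vowel: lips narrowed
}

def mouth_track(segments, fps=24):
    """Interpolate between per-phoneme targets; returns one tuple per frame."""
    # Place a keyframe at the midpoint of each segment.
    keys, t = [], 0.0
    for ph, dur in segments:
        keys.append((t + dur / 2.0, TARGETS[ph]))
        t += dur
    frames = []
    for i in range(int(t * fps)):
        ft = (i + 0.5) / fps          # frame-centre time
        if ft <= keys[0][0]:          # hold first target before first key
            frames.append(keys[0][1]); continue
        if ft >= keys[-1][0]:         # hold last target after last key
            frames.append(keys[-1][1]); continue
        for (t0, p0), (t1, p1) in zip(keys, keys[1:]):
            if t0 <= ft <= t1:        # linear blend between adjacent keys
                a = (ft - t0) / (t1 - t0)
                frames.append(tuple(x0 + a * (x1 - x0)
                                    for x0, x1 in zip(p0, p1)))
                break
    return frames

track = mouth_track(SEGMENTS)
```

Because the same segment list times both the synthetic speech and the parameter track, lip movement stays synchronised with the audio by construction, which is the source of the cost saving over frame-by-frame manual lip sync.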




Cite this article

Hill, D.R., Pearce, A. & Wyvill, B. Animating speech: an automated approach using speech synthesised by rules. The Visual Computer 3, 277–289 (1988). https://doi.org/10.1007/BF01914863
