Coordinating Vocal and Visual Parameters for 3D Virtual Agents

Pelachaud, Catherine; Prevost, Scott

doi:10.1007/978-3-7091-9433-1_9

Part of the book series: Eurographics (EUROGRAPH)

Abstract

This paper presents an implemented system for automatically producing prosodically appropriate speech and corresponding facial expressions for animated, three-dimensional agents that respond to simple database queries in a 3D virtual environment. Unlike previous text-to-facial animation approaches, the system described here produces synthesized speech and facial animations entirely from scratch, starting with semantic representations of the message to be conveyed, which are based in turn on a discourse model and a small database of facts about the modeled world.
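
The flow described in the abstract (database of facts, then discourse model, then semantic content, then coordinated vocal and facial parameters) can be illustrated with a toy sketch. This is not the authors' implementation: the `FACTS` table, the `DiscourseModel` and `Token` classes, the `answer` function, and the rule "place a pitch accent and an eyebrow raise on content words that are new to the discourse" are all assumptions made here for illustration only.

```python
from dataclasses import dataclass, field

# Hypothetical database of facts about the modeled world:
# (entity, attribute) -> value. Contents are invented for this sketch.
FACTS = {
    ("amplifier", "color"): "black",
    ("speaker", "color"): "white",
}

# Words that never receive an accent in this simplified rule set.
FUNCTION_WORDS = {"the", "is"}


@dataclass
class DiscourseModel:
    """Records which words have already been mentioned in the dialogue."""
    mentioned: set = field(default_factory=set)

    def is_new(self, word: str) -> bool:
        return word not in self.mentioned

    def mention(self, words) -> None:
        self.mentioned.update(words)


@dataclass
class Token:
    word: str
    accented: bool    # request a pitch accent from the speech synthesizer
    brow_raise: bool  # request an eyebrow raise from the facial animation


def answer(entity: str, attribute: str, model: DiscourseModel) -> list:
    """Answer 'What <attribute> is the <entity>?' with annotated tokens."""
    value = FACTS[(entity, attribute)]
    # The question itself introduces the entity and attribute into the discourse.
    model.mention([entity, attribute])
    words = ["the", entity, "is", value]
    tokens = []
    for word in words:
        is_content = word not in FUNCTION_WORDS
        new = is_content and model.is_new(word)
        # Assumed rule: the vocal and visual cues are driven by the same flag,
        # so they stay aligned on the same word.
        tokens.append(Token(word, accented=new, brow_raise=new))
    model.mention(words)
    return tokens


if __name__ == "__main__":
    model = DiscourseModel()
    for token in answer("amplifier", "color", model):
        print(token)
```

Deriving both annotations from one semantic decision, rather than analysing finished text or audio afterwards, is the point of the sketch: the speech and the face cannot drift apart because they share a single source.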

Author information

Authors and Affiliations

Dept. of Computer Science and Systems, University of Rome “La Sapienza”, Italy
Catherine Pelachaud
Dept. of Computer and Information Science, University of Pennsylvania, USA
Scott Prevost

Editor information

Editors and Affiliations

Fraunhofer-Institut für Graphische Datenverarbeitung, Darmstadt, Federal Republic of Germany
Martin Göbel

About this paper

Cite this paper

Pelachaud, C., Prevost, S. (1995). Coordinating Vocal and Visual Parameters for 3D Virtual Agents. In: Göbel, M. (ed.) Virtual Environments ’95. Eurographics. Springer, Vienna. https://doi.org/10.1007/978-3-7091-9433-1_9

DOI: https://doi.org/10.1007/978-3-7091-9433-1_9
Publisher Name: Springer, Vienna
Print ISBN: 978-3-211-82737-6
Online ISBN: 978-3-7091-9433-1
eBook Packages: Springer Book Archive
