BEAT: the Behavior Expression Animation Toolkit

Chapter in: Life-Like Characters

Part of the book series: Cognitive Technologies (COGTECH)

Summary

The Behavior Expression Animation Toolkit (BEAT) allows animators to input typed text that they wish to be spoken by an animated human figure, and to obtain as output appropriate and synchronized non-verbal behaviors and synthesized speech in a form that can be sent to a number of different animation systems. The non-verbal behaviors are assigned on the basis of actual linguistic and contextual analysis of the typed text, relying on rules derived from extensive research into human conversational behavior. The toolkit is extensible, so that new rules can be quickly added. It is designed to plug into larger systems that may also assign personality profiles, motion characteristics, scene constraints, or the animation styles of particular animators.
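To make the described pipeline concrete, here is a minimal sketch in Python of the kind of text-to-behavior processing the summary outlines: typed text is analyzed for discourse-new words, and an extensible list of rules then suggests synchronized nonverbal behaviors. All names and rules below are hypothetical illustrations, not the toolkit's actual API; the published system passes an XML tree through language-tagging, behavior-suggestion, behavior-selection, and scheduling modules.

```python
# Illustrative sketch of a BEAT-style text-to-behavior pipeline.
# Hypothetical names throughout; the real toolkit operates on an XML
# tree and includes behavior-selection and scheduling stages omitted here.

from dataclasses import dataclass, field

@dataclass
class Word:
    text: str
    is_new: bool = False                       # first mention in the discourse
    behaviors: list = field(default_factory=list)

def tag_language(words, discourse_memory):
    """Linguistic/contextual analysis stand-in: mark words not yet
    seen in the discourse as 'new' information."""
    for w in words:
        key = w.text.lower()
        if key not in discourse_memory:
            w.is_new = True
            discourse_memory.add(key)
    return words

# Behavior-suggestion rules. Each rule inspects a word and may attach a
# nonverbal behavior; adding a function to RULES adds a rule, mirroring
# the extensibility the summary describes.
def suggest_beat_gesture(w):
    if w.is_new:
        w.behaviors.append("GESTURE_BEAT")

def suggest_pitch_accent(w):
    if w.is_new:
        w.behaviors.append("PITCH_ACCENT")

RULES = [suggest_beat_gesture, suggest_pitch_accent]

def run_pipeline(text, discourse_memory):
    """Text in, annotated words out; a downstream animation system
    would consume the word/behavior pairs."""
    words = [Word(t) for t in text.split()]
    tag_language(words, discourse_memory)
    for w in words:
        for rule in RULES:
            rule(w)
    return words

if __name__ == "__main__":
    memory = set()
    for w in run_pipeline("BEAT animates typed text", memory):
        print(w.text, w.behaviors)
```

In this sketch, extensibility amounts to appending a new rule function to RULES, a rough analogue of the chapter's claim that new behavior rules can be quickly added to the toolkit.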

This chapter is a reprint from the Proceedings of SIGGRAPH’01, August 12–17, Los Angeles, CA (ACM Press 2001), pp. 477–486. The chapter has been adapted in style for consistency.






Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Cassell, J., Vilhjálmsson, H.H., Bickmore, T. (2004). BEAT: the Behavior Expression Animation Toolkit. In: Prendinger, H., Ishizuka, M. (eds) Life-Like Characters. Cognitive Technologies. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-08373-4_8

  • DOI: https://doi.org/10.1007/978-3-662-08373-4_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-05655-0

  • Online ISBN: 978-3-662-08373-4

