Describing and generating multimodal contents featuring affective lifelike agents with MPML
Abstract
In this paper, we provide an overview of our research on multimodal media and contents using embodied lifelike agents. In particular, we describe our research centered on MPML (Multimodal Presentation Markup Language). MPML allows people to write and produce multimodal contents easily, and serves as a core for integrating the various components and functionalities important for multimodal media. Several versions of MPML have been developed, all keeping its basic format, to demonstrate its benefits and usability in a variety of environments: the animated Web, 3D VRML space, mobile phones, and the physical world with a humanoid robot. Because the emotional behavior of an agent is an important factor in making it lifelike and in having it accepted by people as an attractive and friendly style of human-computer interaction, emotion-related functions have been emphasized in MPML. To alleviate the workload of authoring contents, agents must also be endowed with a certain level of autonomy. We show some of our approaches towards this end.
Keywords
Lifelike Agent, Multimodal Contents, Content Description Language, Emotion, Affective Computing