
Describing and generating multimodal contents featuring affective lifelike agents with MPML

  • Invited Paper
  • Published in: New Generation Computing

Abstract

In this paper, we provide an overview of our research on multimodal media and content using embodied lifelike agents, centered on MPML (Multimodal Presentation Markup Language). MPML allows people to write and produce multimodal content easily, and serves as a core for integrating the various components and functionalities important for multimodal media. To demonstrate the benefits and usability of MPML in a variety of environments, including the animated Web, 3D VRML space, mobile phones, and the physical world with a humanoid robot, several versions of MPML have been developed while keeping its basic format. Since the emotional behavior of an agent is an important factor in making it lifelike and in making it accepted as an attractive and friendly style of human-computer interaction, emotion-related functions have been emphasized in MPML. To alleviate the workload of authoring content, the agents must also be endowed with a certain level of autonomy. We show some of our approaches toward this end.
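The flavor of such a script can be suggested with a small sketch. The fragment below is a hypothetical MPML-style example, not the normative syntax: the tag and attribute names (scene, act, speak, move, emotion) and the agent names are illustrative assumptions, since the concrete tag set differs among the MPML versions described in the paper.

    <?xml version="1.0"?>
    <mpml>
      <head>
        <title>Welcome presentation</title>
      </head>
      <body>
        <!-- Two character agents take turns presenting a Web page. -->
        <scene agents="guide,parrot">
          <act agent="guide">
            <!-- Emotion tags color the speech and gestures of the agent;
                 emotion support is a central design goal of MPML.
                 Tag names here are assumed for illustration. -->
            <emotion type="happy">
              <speak>Welcome! Let me walk you through our new page.</speak>
            </emotion>
            <move to="top-right"/>
          </act>
          <act agent="parrot">
            <speak>And I will point out the highlights.</speak>
          </act>
        </scene>
      </body>
    </mpml>

Because a script stays at this declarative level, the same content can in principle be compiled to different back ends such as Web character animation, a VRML viewer, a mobile phone, or a humanoid robot, which is how one basic format can serve the variety of environments listed above.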



Author information

Mitsuru Ishizuka, Ph.D.: He is a professor at the Graduate School of Information Science and Technology, Univ. of Tokyo. Previously, he worked at NTT Yokosuka Laboratory and the Institute of Industrial Science, Univ. of Tokyo. During 1980–81, he was a visiting associate professor at Purdue University. He received his B.S., M.S. and Ph.D. degrees in electronic engineering from the Univ. of Tokyo. His research interests are in the areas of artificial intelligence, multimodal media with lifelike agents, and intelligent WWW information space. He is a member of IEEE, AAAI, the Japanese Society for AI (currently its president), IPS Japan, IEICE Japan, etc.

Helmut Prendinger, Ph.D.: He is an associate professor at the National Institute of Informatics. Previously, he held positions as a research associate and JSPS post-doctoral fellow at the Univ. of Tokyo. Earlier he worked as a junior specialist at the Univ. of California, Irvine. He received his M.A. and Ph.D. degrees from the Univ. of Salzburg, Dept. of Logic and Philosophy of Science and Dept. of Computer Science. His research interests include artificial intelligence, affective computing, and human-computer interaction, areas in which he has published more than 65 papers in international journals and conferences. He is co-editor (with Mitsuru Ishizuka) of the book Life-Like Characters, which appeared in the Cognitive Technologies series of Springer.

About this article

Cite this article

Ishizuka, M., Prendinger, H. Describing and generating multimodal contents featuring affective lifelike agents with MPML. New Gener Comput 24, 97–128 (2006). https://doi.org/10.1007/BF03037295
