New Generation Computing, Volume 24, Issue 2, pp 97–128

Describing and generating multimodal contents featuring affective lifelike agents with MPML

  • Mitsuru Ishizuka
  • Helmut Prendinger
Invited Paper

Abstract

In this paper, we provide an overview of our research on multimodal media and contents using embodied lifelike agents. In particular, we describe our research centered on MPML (Multimodal Presentation Markup Language). MPML allows people to write and produce multimodal contents easily, and serves as a core for integrating the various components and functionalities important for multimodal media. To demonstrate the benefits and usability of MPML in a variety of environments, including the animated Web, 3D VRML space, mobile phones, and the physical world with a humanoid robot, several versions of MPML have been developed while keeping its basic format. Since the emotional behavior of an agent is an important factor in making agents lifelike and in having them accepted by people as an attractive and friendly style of human-computer interaction, emotion-related functions have been emphasized in MPML. To alleviate the workload of authoring contents, agents must also be endowed with a certain level of autonomy. We show some of our approaches towards this end.

Keywords

Lifelike Agent, Multimodal Contents, Content Description Language, Emotion, Affective Computing


Copyright information

© Ohmsha, Ltd. and Springer 2006

Authors and Affiliations

  • Mitsuru Ishizuka (1)
  • Helmut Prendinger (2)
  1. School of Information Science and Technology, The University of Tokyo, Japan
  2. National Institute of Informatics, Japan
