Language Resources and Evaluation, Volume 42, Issue 2, pp 253–264

Introduction to the special issue on multimodal corpora for modeling human multimodal behavior

  • Jean-Claude Martin
  • Patrizia Paggio
  • Peter Kuehnlein
  • Rainer Stiefelhagen
  • Fabio Pianesi

Why a special issue on multimodal corpora?

There is an increasing interest in multimodal communication as suggested by several national and international projects (ISLE, HUMAINE, SIMILAR, CHIL, AMI, CALO, VACE, CALLAS), the attention devoted to the topic by well-known institutions and organizations (the National Institute of Standards and Technology, the Linguistic Data Consortium), and the success of conferences related to multimodal communication (ICMI, IVA, Gesture, Measuring Behavior, Nordic Symposium on Multimodal Communication, LREC Workshops on Multimodal Corpora).

As Dutoit et al. (2006) lament, however, “there is a lack of multimodal corpora suitable for the evaluation of recognition/synthesis approaches and interaction strategies … one must admit that most corpora available today target the study of a limited number of modalities, if not one”. Corpora are relevant not only to evaluation; their importance extends to all the stages of design and development of...





We want to thank the many reviewers who agreed to review manuscripts for this special issue on multimodal corpora. We are very grateful for the hard work they all put into it.

Irene Albrecht

Jens Allwood

Elisabeth André

Gerard Bailly

Alexandre Benoit

Jonas Beskow

Annelies Braffort

Tom Brondsted

Stéphanie Buisine

Susanne Burger

Geneviève Calbris

Loredana Cerrato

Lei Chen

Christopher Cieri

Alice Caplier

Patrice Dalle

Thierry Declerck

Susan Duncan

Michael Dyer

E. Eide

Vincent Girondel

Dirk Heylen

Michael Johnston

Ed Kaiser

Kostas Karpouzis

Michael Kipp

Stefan Kopp

Emiel Krahmer

Peter Kühnlein

Daniel Loehr

Wolfgang Minker

Djamel Mostefa

Anton Nijholt

Jean-Marc Odobez

Catherine Pelachaud

Paul Piwek

Isabella Poggi

Andrei Popescu-Belis

Gerasimos Potamianos

Matthew Purver

Jean-Hugues Réty

Florian Schiel

Nicu Sebe

Ielka van der Sluis

Sotaro Kita

Matthew Stone

Paul Tepper

Kris Thorisson

J. Wiebe

Jie Yang

Dong Zhang


  1. Almeida, L., Amdal, I., Beires, N., Boualem, M., Boves, L., Os, E., et al. (2002). The MUST Guide to Paris; Implementation and expert evaluation of a multimodal tourist guide to Paris. Multi-modal dialogue in mobile environments, ISCA Tutorial and Research Workshop (IDS’2002), Kloster Irsee, Germany.
  2. André, E. (2006). Corpus-based approaches to behavior modeling for virtual humans: A critical review, Modeling communication with robots and virtual humans. Workshop of the ZiF: Research Group 2005/2006 “Embodied communication in humans and machines”. Scientific Organization: Ipke Wachsmuth (Bielefeld), Günther Knoblich (Newark).
  3. Argyle, M. (2004). Bodily communication (2nd ed.). London and New York: Routledge, Taylor & Francis.
  4. Bernsen, N. O., & Dybkjær, L. (2004). Evaluation of spoken multimodal conversation. In Sixth International Conference on Multimodal Interaction (ICMI’2004). New York: Association for Computing Machinery (ACM).
  5. Beun, R.-J., & Cremers, A. (2001). Multimodal reference to objects: An empirical approach. In Proceedings of Cooperative Multimodal Communication: Second International Conference (CMC’98). Revised Papers, Tilburg, The Netherlands: Springer-Verlag GmbH.
  6. Buisine, S. (2005). Conception et Évaluation d’Agents Conversationnels Multimodaux Bidirectionnels. Doctorat de Psychologie Cognitive-Ergonomie, Paris V. 8 avril 2005. Direction J.-C. Martin & J.-C. Sperandio.
  7. Butterworth, B., & Beattie, G. (1978). Gesture and silence as indicators of planning in speech. In R. N. Campbell & P. Smith (Eds.), Recent advances in the psychology of language: Formal and experimental approaches (pp. 347–360). New York: Plenum.
  8. Cassell, J., Bickmore, T., Campbell, L., Vilhjálmsson, H., & Yan, H. (2000). Human conversation as a system framework: Designing embodied conversational agents. In J. Cassell, S. Prevost, & E. Churchill (Eds.), Embodied conversational agents (pp. 29–63). Cambridge, MA: MIT.
  9. Cassell, J., Nakano, Y. I., Bickmore, T. W., Sidner, C. L., & Rich, C. (2001). Annotating and generating posture from discourse structure in embodied conversational agents. In Workshop “Multimodal communication and context in embodied agents”, 5th International Conference on Autonomous Agents, Montreal.
  10. Cassell, J., Pelachaud, C., Badler, N., Steedman, M., Achorn, B., Becket, T., et al. (1994). Animated conversation: Rule-based generation of facial expression, gesture and spoken intonation for multiple conversational agents. ACM SIGGRAPH’94.
  11. Cassell, J., Torres, O., & Prevost, S. (1999). Turn taking vs. discourse structure: How best to model multimodal conversation. In Y. Wilks (Ed.), Machine conversations (pp. 143–154). The Hague: Kluwer.
  12. Chen, L., Travis Rose, R. T., Qiao, Y., Kimbara, I., Parrill, F., Welji, H., et al. (2006). VACE multimodal meeting corpus. In Second International Workshop on Machine Learning for Multimodal Interaction-MLMI. Lecture Notes in Computer Science. Berlin: Springer.
  13. Cohn, J. F., & Ekman, P. (2005). Measuring facial action. In J. A. Harrigan, R. Rosenthal, & K. Scherer (Eds.), The new handbook of methods in nonverbal behavior research. Oxford University Press.
  14. Collier, G. (1985). Emotional expression. Lawrence Erlbaum Associates.
  15. Dutoit, T., Nigay, L., & Schnaider, M. (2006). In T. Dutoit, L. Nigay, & M. Schnaider (Eds.), Multimodal human–computer interfaces. Elsevier. Journal of Signal Processing. Special Issue on “Multimodal Human–computer Interfaces”, 86(12), 3515–3517.
  16. Ekman, P. (1999). Basic emotions. In T. Dalgleish & M. J. Power (Eds.), Handbook of cognition & emotion (pp. 301–320). New York: Wiley.
  17. Ekman, P. (2003). Emotions revealed. Understanding faces and feelings. Weidenfeld & Nicolson.
  18. Ekman, P., & Friesen, W. V. (1975). Unmasking the face. A guide to recognizing emotions from facial clues. Englewood Cliffs, NJ: Prentice-Hall Inc.
  19. Ekman, P., Friesen, W. V., & Hager, J. C. (2002). Facial action coding system. The manual on CD ROM. Research Nexus division of Network Information Research Corporation.
  20. Feldman, R. S., & Rimé, B. (1991). Fundamentals of nonverbal behavior. Cambridge University Press.
  21. Garofolo, J., Laprun, C., Michel, M., Stanford, V., & Tabassi, E. (2004). The NIST Meeting Room Pilot Corpus. Language Resource and Evaluation Conference.
  22. Goldin-Meadow, S., Kim, S., & Singer, M. (1999). What the teacher’s hands tell the student’s mind about math. Journal of Educational Psychology, 91, 720–730. doi: 10.1037/0022-0663.91.4.720.
  23. Harrigan, J. A., Rosenthal, R., & Scherer, K. (2005). The new handbook of methods in nonverbal behavior research. Oxford University Press.
  24. Holzapfel, H., Nickel, K., & Stiefelhagen, R. (2004). Implementation and evaluation of a constraint-based multimodal fusion system for speech and 3D pointing gestures. ICMI 2004.
  25. Janin, A., Baron, D., Edwards, J., Ellis, D., Gelbart, D., Morgan, N., et al. (2003). IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP).
  26. Johnston, O., & Thomas, F. (1995). The illusion of life: Disney animation. Disney Editions.
  27. Kendon, A. (2004). Gesture: Visible action as utterance. Cambridge University Press.
  28. Kettebekov, S., Yeasin, M., Krahnstoever, N., & Sharma, R. (2002). Prosody based co-analysis of deictic gestures and speech in Weather Narration Broadcast. Workshop on multimodal resources and multimodal systems evaluation. In Conference on Language Resources and Evaluation (LREC’2002), Las Palmas, Canary Islands, Spain.
  29. Kipp, M. (2004). Gesture generation by imitation. From human behavior to computer character animation. Florida, Boca Raton,
  30. Kita, S. (2003). Interplay of gaze, hand, torso orientation, and language in pointing. In S. Kita (Ed.), Pointing. Where language, culture, and cognition meet (pp. 307–328). London: Lawrence Erlbaum Associates.
  31. Knapp, M. L., & Hall, J. A. (2006). Nonverbal communication in human interaction (6th ed.). Thomson Wadsworth.
  32. Kranstedt, A., Kühnlein, P., & Wachsmuth, I. (2004). Deixis in multimodal human–computer interaction. In A. Camurri & G. Volpe (Eds.), Gesture-based communication in human–computer interaction. 5th International Gesture Workshop, GW 2003, Genova, Italy. Springer. LNAI 2915.
  33. Krauss, R. M. (1998). Why do we gesture when we speak? Current Directions in Psychological Science, 7, 54–59. doi: 10.1111/1467-8721.ep13175642.
  34. Kress, G., Jewitt, C., Ogborn, J., & Tsatsarelis, C. (2001). Multimodal teaching and learning. The rhetorics of the science classroom. Continuum.
  35. Loehr, D. (2004). Gesture and intonation. Faculty of the Graduate School of Arts and Sciences of Georgetown University.
  36. Martin, J. C. (2006). Multimodal human–computer interfaces and individual differences. Annotation, perception, representation and generation of situated multimodal behaviors. Habilitation à diriger des recherches en Informatique. Université Paris XI, 6th December 2006.
  37. Martin, J.-C., den Os, E., Kühnlein, P., Boves, L., Paggio, P., & Catizone, R. (2004). Workshop on multimodal corpora: Models of human behaviour for the specification and evaluation of multimodal input and output interfaces. In Association with the 4th International Conference on Language Resources and Evaluation LREC2004. Lisbon, Portugal: Centro Cultural de Belem.
  38. Martin, J.-C., Kühnlein, P., Paggio, P., Stiefelhagen, R., & Pianesi, F. (2006). Workshop on multimodal corpora: From multimodal behaviour theories to usable models. In Association with the 5th International Conference on Language Resources and Evaluation (LREC2006), Genoa, Italy.
  39. Maybury, M., & Martin, J.-C. (2002). Workshop on multimodal resources and multimodal systems evaluation. In Conference on Language Resources and Evaluation (LREC’2002), Las Palmas, Canary Islands, Spain.
  40. McCowan, I., Carletta, J., Kraaij, W., Ashby, S., Bourban, S., Flynn, M., et al. (2005). The AMI meeting corpus. In Measuring Behavior 2005 Symposium on “Annotating and Measuring Meeting Behavior”.
  41. McNeill, D. (1992). Hand and mind: What gestures reveal about thought. Chicago, IL: University of Chicago Press.
  42. McNeill, D. (2005). Gesture and thought. The University of Chicago Press.
  43. McNeill, D., Quek, F., McCullough, K.-E., Duncan, S., Furuyama, N., Bryll, R., et al. (2001). Catchments, prosody and discourse. Gesture, 1(1), 9–33. doi: 10.1075/gest.1.1.03mcn.
  44. Oviatt, S. L. (2003). Multimodal interfaces. In J. Jacko & A. Sears (Eds.), Human–computer interaction handbook: Fundamentals, evolving technologies and emerging applications (Vol. 14, pp. 286–304). Mahwah, NJ: Lawrence Erlbaum Associates.
  45. Pelachaud, C., Braffort, A., Breton, G., Ech Chadai, N., Gibet, S., Martin, J.-C., et al. (2004). AGENTS CONVERSATIONELS: Systèmes d’animation Modélisation des comportements multimodaux Applications: agents pédagogiques et agents signeurs. Action Spécifique du CNRS Humain Virtuel. (Eds.).
  46. Pentland, A. (2005). Socially aware computation and communication. IEEE Computer.
  47. Piwek, P., & Beun, R. J. (2001). Multimodal referential acts in a dialogue game: From empirical investigations to algorithms. In International Workshop on Information Presentation and Natural Multimodal Dialogue (IPNMD-2001), Verona, Italy.
  48. Poggi, I. (1996). Mind markers. In 5th International Pragmatics Conference, Mexico City.
  49. Poggi, I. (2003). Mind markers. In M. Rector, I. Poggi, & N. Trigo (Eds.), Gestures. Meaning and use (pp. 119–132). Oporto, Portugal: University Fernando Pessoa Press.
  50. Rist, T., André, E., Baldes, S., Gebhard, P., Klesen, M., Kipp, M., et al. (2003). A review of the development of embodied presentation agents and their application fields. In H. Prendinger & M. Ishizuka (Eds.), Life-like characters: Tools, affective functions, and applications (pp. 377–404). Springer.
  51. Ruttkay, Z., & Pelachaud, C. (2004). From brows to trust—evaluating embodied conversational agents. Kluwer.
  52. Siegman, A. W., & Feldstein, S. (1985). Multichannel integrations of nonverbal behavior. LEA.
  53. Tepper, P., Kopp, S., & Cassell, J. (2004). Content in context: Generating language and iconic gesture without a gestionary. In Workshop on Balanced Perception and Action in ECAs at Autonomous Agents and Multiagent Systems (AAMAS), New York, NY.
  54. van der Sluis, I., & Krahmer, E. (2004). Production experiments for evaluating multimodal generation. In 4th International Conference on Language Resources and Evaluation (LREC’2004).
  55. Vinayagamoorthy, V., Gillies, M., Steed, A., Tanguy, E., Pan, X., Loscos, C., et al. (2006). Building expression into virtual characters. In Eurographics Conference State of the Art Reports.
  56. Wahlster, W. (2006). SmartKom: Foundations of multimodal dialogue systems. Heidelberg, Germany: Springer.
  57. Wegener Knudsen, M., Martin, J.-C., Dybkjær, L., Berman, S., Bernsen, N. O., Choukri, K., et al. (2002a). Survey of NIMM data resources, current and future user profiles, markets and user needs for NIMM resources. ISLE Natural Interactivity and Multimodality. Working Group Deliverable D8.1.
  58. Wegener Knudsen, M., Martin, J.-C., Dybkjær, L., Machuca Ayuso, M.-J., Bernsen, N. O., Carletta, J., et al. (2002b). Survey of multimodal annotation schemes and best practice. ISLE Natural Interactivity and Multimodality. Working Group Deliverable D9.1. February.

Copyright information

© Springer Science+Business Media B.V. 2008

Authors and Affiliations

  • Jean-Claude Martin (1)
  • Patrizia Paggio (2)
  • Peter Kuehnlein (3)
  • Rainer Stiefelhagen (4)
  • Fabio Pianesi (5)

  1. CNRS-LIMSI, Orsay, France
  2. University of Copenhagen, Copenhagen, Denmark
  3. CLCG, University of Groningen, Groningen, The Netherlands
  4. University of Karlsruhe (TH), Karlsruhe, Germany
  5. FBK-irst, Fondazione Bruno Kessler, Trento, Italy
