The MUMIN coding scheme for the annotation of feedback, turn management and sequencing phenomena

Abstract

This paper deals with a multimodal annotation scheme dedicated to the study of gestures in interpersonal communication, with particular regard to the role played by multimodal expressions for feedback, turn management and sequencing. The scheme has been developed under the framework of the MUMIN network and tested on the analysis of multimodal behaviour in short video clips in Swedish, Finnish and Danish. The preliminary results obtained in these studies show that the reliability of the categories defined in the scheme is acceptable, and that the scheme as a whole constitutes a versatile analysis tool for the study of multimodal communication behaviour.
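
To make the kind of annotation concrete, the sketch below shows one possible programmatic representation of a single MUMIN-style tag. It is a minimal illustration only: the field names and example values are assumptions drawn from the three phenomena named in the abstract (feedback, turn management and sequencing), not the scheme's official attribute set, which is specified in the MUMIN technical report (Allwood et al. 2004).

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical record for one annotated gesture. Field names and example
# values are illustrative assumptions, not MUMIN's official tag set
# (for that, see Allwood et al. 2004).
@dataclass
class MultimodalAnnotation:
    start: float                           # start time in seconds
    end: float                             # end time in seconds
    modality: str                          # e.g. "head", "face", "hand"
    feedback: Optional[str] = None         # e.g. "give/accept"
    turn_management: Optional[str] = None  # e.g. "turn-take", "turn-yield"
    sequencing: Optional[str] = None       # e.g. "open-sequence"

# A few invented annotations for a short clip, tagged by one coder.
clip = [
    MultimodalAnnotation(1.2, 1.8, "head", feedback="give/accept"),
    MultimodalAnnotation(4.0, 4.6, "hand", turn_management="turn-yield"),
]

for ann in clip:
    label = ann.feedback or ann.turn_management or ann.sequencing
    print(f"{ann.modality} gesture {ann.start:.1f}-{ann.end:.1f}s: {label}")
```

In practice such records would be produced in an annotation tool; the tools cited in the references (e.g. Anvil, the NITE workbench) offer timeline-based interfaces for exactly this kind of time-aligned tagging.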

Notes

  1. We follow here Duncan (2004), who defines a gesture as a movement that is always characterised by a stroke and may also go through a preparation and a retraction phase. In MUMIN, each stroke corresponds to an independent gesture.
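
For concreteness, the segmentation convention described in this note can be read as a simple grouping procedure over an ordered list of phases. The code below is a minimal sketch of one way to implement that reading; the `Phase` type and function name are assumptions for illustration, not part of the MUMIN specification.

```python
from dataclasses import dataclass

@dataclass
class Phase:
    kind: str     # "preparation", "stroke", or "retraction"
    start: float  # seconds
    end: float    # seconds

def split_into_gestures(phases):
    """Group an ordered list of phases into gestures: each stroke anchors one
    independent gesture (the convention adopted from Duncan 2004); an optional
    preceding preparation and a following retraction are attached to it."""
    gestures, current = [], []
    for phase in phases:
        if phase.kind == "retraction" and not current and gestures:
            # a retraction right after a stroke closes the previous gesture
            gestures[-1].append(phase)
        else:
            current.append(phase)
            if phase.kind == "stroke":
                gestures.append(current)  # the stroke completes this gesture
                current = []
    return gestures

# Two strokes yield two independent gestures, even within one continuous movement.
phases = [Phase("preparation", 0.0, 0.4), Phase("stroke", 0.4, 0.9),
          Phase("stroke", 0.9, 1.3), Phase("retraction", 1.3, 1.6)]
print(len(split_into_gestures(phases)))  # -> 2
```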

References

  • Allwood, J. (2001a). Dialog Coding—function and grammar. Gothenburg Papers in Theoretical Linguistics, 85. Department of Linguistics, Gothenburg University.

  • Allwood, J. (2001b). The structure of dialog. In M. Taylor, D. Bouwhuis, & F. Nel (Eds.), The structure of multimodal dialogue II (pp. 3–24). Amsterdam: Benjamins.

  • Allwood, J., & Cerrato, L. (2003). A study of gestural feedback expressions. In P. Paggio et al. (Eds.), Proceedings of the First Nordic Symposium on Multimodal Communication (pp. 7–22).

  • Allwood, J., Nivre, J., & Ahlsén, E. (1992). On the semantics and pragmatics of linguistic feedback. Journal of Semantics, 9, 1–26.

  • Allwood, J., Cerrato, L., Dybkjær, L., Jokinen, K., Navarretta, C., & Paggio, P. (2004). The MUMIN multimodal coding scheme. Technical report available at http://www.cst.dk/mumin/. CST, University of Copenhagen, Denmark.

  • Bailly, G., Elisei, F., Badin, P., & Savariaux, C. (2006). Degrees of freedom of facial movements in face-to-face conversational speech. In Proceedings of the LREC 2006 workshop on multimodal corpora (pp. 33–37). Genoa, Italy.

  • Bernsen, N. O., Dybkjær, L., & Kolodnytsky, M. (2002). The NITE workbench—a tool for annotation of natural interactivity and multimodal data. In Proceedings of LREC 2002 (pp. 43–49).

  • Cassell, J. (2000). Nudge nudge wink wink: Elements of face-to-face conversation for embodied conversational agents. In J. Cassell et al. (Eds.), Embodied conversational agents (pp. 1–27). Cambridge, MA: MIT Press.

  • Cerrato, L. (2004). A coding scheme for the annotation of feedback phenomena in conversational speech. In J. C. Martin et al. (Eds.), Proceedings of the LREC 2004 workshop on models of human behaviour (pp. 25–28).

  • Cerrato, L. (2007). Investigating communicative feedback phenomena across languages and modalities. PhD thesis in Speech and Music Communication, KTH, Stockholm.

  • Clark, H. H., & Schaefer, E. F. (1989). Contributing to discourse. Cognitive Science, 13, 259–294.

  • Cowie, R. (2000). Describing the emotional states expressed in speech. In Proceedings of the ISCA workshop on speech and emotion (pp. 11–19).

  • Craggs, R., & McGee Wood, M. (2004). A categorical annotation scheme for emotion in the linguistic content of dialogue. In Affective dialogue systems: Proceedings of the tutorial and research workshop, Kloster Irsee, Germany, June 14–16. Lecture Notes in Computer Science (pp. 89–100). Berlin, Heidelberg: Springer.

  • Duncan, S. (2004). Coding manual. Technical report available from http://www.mcneilllab.uchicago.edu.

  • Duncan, S. Jr., & Fiske, D. W. (1977). Face-to-face interaction: Research, methods and theory. Hillsdale, NJ: Lawrence Erlbaum Associates.

  • Ekman, P. (1999). Basic emotions. In T. Dalgleish & M. Power (Eds.), The handbook of cognition and emotion (pp. 45–60). NY: Wiley.

  • Ekman, P., & Friesen, W. V. (1978). Facial action coding system. Palo Alto: Consulting Psychologist Press.

  • Ekman, P., & Friesen, W. V. (2003). Unmasking the face: A guide to recognizing emotions from facial cues. Cambridge, Massachusetts: Malor Books.

  • Gunnarsson, M. (2002). User manual for multiTool. Technical report available from http://www.ling.gu.se/mgunnar/multitool/MT-manual.pdf.

  • Harrigan, J. A., Rosenthal, R., & Scherer, K. R. (2005). The new handbook of methods in nonverbal behavior research. New York: Oxford University Press.

  • Kendon, A. (2004). Gesture. Cambridge: Cambridge University Press.

  • Kipp, M. (2001). Anvil—A generic annotation tool for multimodal dialogue. In Proceedings of Eurospeech 2001 (pp. 1367–1370).

  • Krippendorff, K. (2004). Content analysis: An introduction to its methodology (2nd ed.). Beverly Hills, CA: Sage Publications.

  • McNeill, D. (1992). Hand and mind: What gestures reveal about thought. Chicago: University of Chicago Press.

  • Peirce, C. S. (1931). In C. Hartshorne & P. Weiss (Eds.), Elements of logic. Collected papers of Charles Sanders Peirce (Vol. 2). Cambridge, MA: Harvard University Press.

  • Rietveld, T., & van Hout, R. (1993). Statistical techniques for the study of language and language behaviour. Berlin: Mouton de Gruyter.

  • Sikorski, T. (1998). Improving dialogue annotation reliability. In Working notes of the AAAI spring symposium on applying machine learning to discourse processing. March. http://www.cs.rochester.edu/u/sikorski/research/s98aaai.html.

  • Spooren, W. (2004). On the use of discourse data in language use research. In H. Aertsen, M. Hannay, & R. Lyall (Eds.), Words in their places: A festschrift for J. Lachlan Mackenzie (pp. 381–393). Amsterdam: Faculty of Arts.

  • Steininger, S., Schiel, F., Dioubina, O., & Rabold, S. (2002). Development of user-state conventions for the multimodal corpus in SmartKom. In Proceedings of the workshop ‘Multimodal Resources and Multimodal Systems Evaluation’ 2002 (pp. 33–37). Las Palmas, Gran Canaria, Spain: ELRA.

  • Thórisson, K. R. (2002). Natural turn-taking needs no manual: Computational theory and model, from perception to action. In B. Granström et al. (Eds.), Multimodality in language and speech systems (pp. 173–207). Dordrecht, The Netherlands: Kluwer Academic.

Author information

Corresponding author

Correspondence to Patrizia Paggio.

Cite this article

Allwood, J., Cerrato, L., Jokinen, K. et al. The MUMIN coding scheme for the annotation of feedback, turn management and sequencing phenomena. Lang Resources & Evaluation 41, 273–287 (2007). https://doi.org/10.1007/s10579-007-9061-5
