The Danish NOMCO corpus: multimodal interaction in first acquaintance conversations

Paggio, Patrizia; Navarretta, Costanza

doi:10.1007/s10579-016-9371-6

Original Paper
Published: 19 October 2016

Volume 51, pages 463–494, (2017)
Language Resources and Evaluation

Patrizia Paggio¹,² & Costanza Navarretta¹

Abstract

This article presents the Danish NOMCO Corpus, an annotated multimodal collection of video-recorded first acquaintance conversations between Danish speakers. The annotation comprises speech transcription with word boundaries, and formal as well as functional coding of gestural behaviours, specifically head movements, facial expressions, and body posture. The corpus has served as the empirical basis for a number of studies of communication phenomena related to turn management, feedback exchange, information packaging and the expression of emotional attitudes. We describe the annotation scheme, procedure, and annotation results. We then summarise a number of studies conducted on the corpus. The corpus is available for research and teaching purposes through the authors of this article.

Notes

We relied on the definition of utterance proposed in Levinson (1983), where an utterance is defined as “the issuance of a sentence, a sentence-analogue, or sentence-fragment, in an actual context” (p. 18).
A step in this direction was taken by developing a face and head tracker ANVIL plug-in (Jongejan 2010), which can be used to further annotate the corpus.
In most cases one coder chose one category as the primary and indicated another possible category in the comment field, while the second coder chose the second category as the primary and mentioned the first one in the comment field.
Unimodal here is intended in the sense of a gesture not accompanied by a word. We do not investigate whether the nod occurs together with other gestural behaviours.

References

Alahverdzhieva, K., & Lascarides, A. (2010). Analysing speech and co-speech gesture in constraint-based grammars. In S. Müller (Ed.), Proceedings of the HPSG10 conference (pp. 6–26). Stanford: CSLI Publications.
Allwood, J. (2002). Bodily communication dimensions of expression and content. In B. Granström, D. House, & I. Karlsson (Eds.), Multimodality in language and speech systems (pp. 7–26). Dordrecht: Springer. doi: 10.1007/978-94-017-2367-1_2.
Allwood, J. (2008). Dimensions of embodied communication: Towards a typology of embodied communication. In I. Wachsmuth, M. Lenzen, & G. Knoblich (Eds.), Embodied communication in humans and machines. Oxford: Oxford University Press.
Allwood, J., Cerrato, L., Jokinen, K., Navarretta, C., & Paggio, P. (2007). The MUMIN coding scheme for the annotation of feedback, turn management and sequencing phenomena. In J. C. Martin, P. Paggio, P. Kuehnlein, R. Stiefelhagen, & F. Pianesi (Eds.), Multimodal corpora for modelling human multimodal behaviour, special issue of the international journal of language resources and evaluation (Vol. 41, pp. 273–287). Berlin: Springer.
Allwood, J., Lanzini, S., & Ahlsén, E. (2014). Contributions of different modalities to the attribution of affective-epistemic states. In P. Paggio & B. N. Wessel-Tolvig (Eds.), Proceedings from the 1st European symposium on multimodal communication University of Malta (pp. 1–6). Valletta: Linköping University Electronic Press.
Allwood, J., Nivre, J., & Ahlsén, E. (1993). On the semantics and pragmatics of linguistic feedback. Journal of Semantics, 9(1), 1–26.
Argyle, M., & Cook, M. (1976). Gaze and mutual gaze. Cambridge: Cambridge University Press.
Aung, M. S. H., Bianchi-Berthouze, N., Watson, P., & Williams, A. C. D. C. (2014). Automatic recognition of fear-avoidance behaviour in chronic pain physical rehabilitation. In Proceedings of 8th international conference on pervasive computing technologies for healthcare.
Boersma, P., & Weenink, D. (2009). Praat: Doing phonetics by computer (version 5.1.05) [computer program]. Retrieved May 1, 2009, from http://www.praat.org/.
Bolinger, D. (1986). Intonation and its parts: Melody in spoken English. Stanford, CA: Stanford University Press.
Bourbakis, N., Esposito, A., & Kavraki, D. (2011). Extracting and associating meta-features for understanding people’s emotional behaviour: Face and speech. Journal of Cognitive Computation, 3, 436–448.
Bunt, H., Alexandersson, J., Choe, J. W., Fang, A. C., Hasida, K., Petukhova, V., et al. (2012). ISO 24617-2: A semantically-based standard for dialogue annotation. In N. Calzolari, K. Choukri, T. Declerck, M. U. Doğan, B. Maegaard, J. Mariani, A. Moreno, J. Odijk, & S. Piperidis (Eds.), LREC, Citeseer (pp. 430–437). European Language Resources Association (ELRA).
Campbell, N., & Scherer, S. (2010). Comparing measures of synchrony and alignment in dialogue speech timing with respect to turn-taking activity. In Proceedings of Interspeech (pp. 2546–2549).
Cavicchio, F., & Poesio, M. (2009). Multimodal corpora annotation: Validation methods to assess coding scheme reliability. In M. Kipp, J. C. Martin, P. Paggio, & D. Heylen (Eds.), Multimodal corpora. Lecture notes in computer science (Vol. 5509). Berlin: Springer.
Cerrato, L. (2007). Investigating communicative feedback phenomena across languages and modalities. Ph.D. thesis, School of Speech and Music Communication, Stockholm, KT.
Cienki, A., & Müller, C. (2008). Metaphor and gesture. Amsterdam: Benjamins.
Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20(1), 37–46.
Dancey, C. P., & Reidy, J. (2004). Statistics without maths for psychology: Using SPSS for Windows. Upper Saddle River, NJ: Prentice-Hall Inc.
De Ruiter, J. P. (2000). The production of gesture and speech. In D. McNeill (Ed.), Language and gesture. Cambridge: Cambridge University Press.
Duncan Jr., S., & Fiske, D. (1977). Face-to-face interaction. Hillsdale, NJ: Erlbaum.
Duncan, S. (1972). Some signals and rules for taking speaking turns in conversations. Journal of Personality and Social Psychology, 23(2), 283–292.
Duncan, S., Cassell, J., & Levy, E. (2007). Gesture and the dynamic dimension of language. Amsterdam: Benjamins.
Ebert, C., Evert, S., & Wilmes, K. (2011). Focus marking via gestures. In I. Reich et al. (Eds.), Proceedings of Sinn & Bedeutung 15 (pp. 193–208). Saarbrücken, Germany: Universaar-Saarland University Press.
Ekman, P. (1992). An argument for basic emotions. Cognition and Emotion, 6(3/4), 169–200.
Ekman, P., & Friesen, W. (1975). Unmasking the face: A guide to recognizing emotions from facial clues. Upper Saddle River: Prentice-Hall.
Ekman, P., & Friesen, W. V. (1969). The repertoire of nonverbal behavior: Categories, origins, usage, and coding. Semiotica, 1(1), 49–98.
Enfield, N. J. (2012). The anatomy of meaning: Speech, gesture, and composite utterances. Cambridge: Cambridge University Press.
Gibbon, D. (2011). Modelling gesture as speech: A linguistic approach. Poznań Studies in Contemporary Linguistics, 47, 470–508.
Giorgolo, G., & Verstraten, F. A. (2008). Perception of ‘speech-and-gesture’ integration. In Proceedings of the international conference on auditory-visual speech processing 2008 (pp. 31–36).
Goodwin, C. (1981). Conversational organization: Interaction between speakers and hearers. New York: Academic Press.
Gullberg, M., & de Bot, K. (Eds.). (2010). Gestures in language development. Amsterdam: Benjamins.
Hadar, U., Steiner, T., & Rose, F. C. (1984). The timing of shifts of head postures during conversation. Human Movement Science, 3(3), 237–245.
Hadar, U., Steiner, T. J., & Rose, F. C. (1985). Head movement during listening turns in conversation. Journal of Nonverbal Behavior, 9(4), 214–228.
Jongejan, B. (2010). Automatic face tracking in ANVIL. In M. Kipp, J. C. Martin, P. Paggio, & D. Heylen (Eds.), Multimodal corpora: Advances in capturing, coding and analyzing multimodality (pp. 201–208). European Language Resources Association (ELRA), May 18, 2010.
Kendon, A. (1967). Some functions of gaze-direction in social interaction. Acta Psychologica, 26, 22–63.
Kendon, A. (1978). Differential perception and attentional frame: Two problems for investigation. Semiotica, 24, 305–315.
Kendon, A. (1980). Gesture and speech: Two aspects of the process of utterance. In M. R. Key (Ed.), Nonverbal communication and language (pp. 207–227). Mouton.
Kendon, A. (2004). Gesture. Cambridge: Cambridge University Press.
Kipp, M. (2004). Gesture generation by Imitation—From human behavior to computer character animation. Boca Raton, FL: Dissertation.com.
Kipp, M., & Martin, J. C. (2009). Gesture and emotion: Can basic gestural form features discriminate emotions? In Proceedings of the international conference on affective computing and intelligent interaction (ACII-09). IEEE Press.
Kita, S., & Özyürek, A. (2003). What does cross-linguistic variation in semantic coordination of speech and gesture reveal?: Evidence for an interface representation of spatial thinking and speaking. Journal of Memory and Language, 48(1), 16–32.
Kousidis, S., Malisz, Z., Wagner, P., & Schlangen, D. (2013). Exploring annotation of head gesture forms in spontaneous human interaction. In Proceedings of the Tilburg gesture meeting (TiGeR).
Leonard, T., & Cummins, F. (2010). The temporal relation between beat gestures and speech. Language and Cognitive Processes, 26(10), 1457–1471.
Levinson, S. (1983). Pragmatics. Cambridge: Cambridge University Press.
Loehr, D. P. (2004). Gesture and intonation. Ph.D. thesis, Georgetown University.
Loehr, D. P. (2007). Aspects of rhythm in gesture and speech. Gesture, 7(2), 179–214.
Lucey, P., Cohn, J. F., Prkachin, K. M., Solomon, P. E., Chew, S., & Matthews, I. (2012). Painful monitoring: Automatic pain monitoring using the UNBC-McMaster shoulder pain expression archive database. Image and Vision Computing, 30(3), 197–205.
Maynard, S. K. (1987). Interactional functions of a nonverbal sign: Head movement in Japanese dyadic casual conversation. Journal of Pragmatics, 11, 589–606.
McClave, E. Z. (2000). Linguistic functions of head movements in the context of speech. Journal of Pragmatics, 32(7), 855–878.
McNeill, D. (1992). Hand and mind: What gestures reveal about thought. Chicago: University of Chicago Press.
McNeill, D. (2005). Gesture and thought. Chicago: University of Chicago Press.
Navarretta, C. (2011). Annotating non-verbal behaviours in informal interactions. In I. A. Esposito, A. Vinciarelli, K. Vicsi, C. Pelachaud, & A. Nijholt (Eds.) Analysis of verbal and nonverbal communication and enactment: The processing issues, LNCS (Vol. 6800, pp. 317–324). Berlin: Springer.
Navarretta, C. (2012). Annotating and analyzing emotions in a corpus of first encounters. In IEEE (Ed.) Proceedings of the 3rd IEEE international conference on cognitive infocommunications (pp. 433–438), Kosice.
Navarretta, C. (2013a). Predicting speech overlaps from speech tokens and co-occurring body behaviours in dyadic conversations. In Proceedings of ACM international conference on multimodal interaction (ICMI 2013) (pp. 157–163). Sydney: ACM.
Navarretta, C. (2013b). Transfer learning in multimodal corpora. In IEEE (Ed.) Proceedings of the 4th IEEE international conference on cognitive infocommunications (CogInfoCom2013) (pp. 195–200). Hungary: Budapest.
Navarretta, C. (2014). Predicting emotions in facial expressions from the annotations in naturally occurring first encounters. Knowledge Based Systems, 71, 34–40.
Navarretta, C., Ahlsén, E., Allwood, J., Jokinen, K., & Paggio, P. (2012). Feedback in Nordic first-encounters: A comparative study (pp. 2494–2499). Istanbul: European language resources distribution agency.
Navarretta, C., & Paggio, P. (2012). Verbal and non-verbal feedback in different types of interactions. In Proceedings of LREC 2012 (pp. 2338–2342). Istanbul.
Navarretta, C., & Paggio, P. (2013a). Classifying multimodal turn management in Danish dyadic first encounters. In NEALT proceedings of the 19th nordic conference of computational linguistics (Nodalida 2013), Oslo, Linköping electronic conference proceedings (pp. 133–146).
Navarretta, C., & Paggio, P. (2013b). Multimodal turn management in Danish dyadic first encounters. In NEALT proceedings. Northern European association for language and technology, Proceedings of the fourth nordic symposium of multimodal communication, Gothenburg, Linköping electronic conference proceedings (pp. 5–12).
Paggio, P. (2006a). Annotating information structure in a corpus of spoken Danish. In Proceedings of the 5th international conference on Language Resources and Evaluation LREC2006 (pp. 1606–1609). Italy: Genova.
Paggio, P. (2006b). Information structure and pauses in a corpus of spoken Danish. In Conference companion of the 11th conference of the European chapter of the association for computational linguistics (pp. 191–194). Italy: Trento.
Paggio, P. (2016). Coordination of head movements and speech in first encounter dialogues. In E. Gilmartin, L. Cerrato, & N. Campbell (Eds.), Proceedings from the 3rd European Symposium on Multimodal Communication, Dublin, September (pp. 69–74). Linköpings universitet: Linköping University Electronic Press.
Paggio, P., Allwood, J., Ahlsén, E., Jokinen, K., & Navarretta, C. (2010). The NOMCO multimodal nordic resource—Goals and characteristics. In Proceedings of the seventh conference on international language resources and evaluation (LREC’10). European Language Resources Association (ELRA), Valletta.
Paggio, P., & Diderichsen, P. (2010). Information structure and communicative functions in spoken and multimodal data. In P.J. Henriksen (Ed.), Linguistic theory and raw sound, Copenhagen studies in language (Vol. 49, pp. 149–168). Frederiksberg: Samfundslitteratur.
Paggio, P., & Navarretta, C. (2011). Head Movements, facial expressions and feedback in Danish first encounters interactions: A culture-specific analysis. In Lecture notes in computer science (Vol. 6766, pp. 583–590). Springer.
Paggio, P., & Navarretta, C. (2012). Classifying the feedback function of head movements and face expressions. In LREC 2012 workshop multimodal corpora—How should multimodal corpora deal with the situation? (pp. 34–37). Istanbul: European language resources distribution agency.
Paggio, P., & Vella, A. (2014). Overlaps in Maltese conversational and task oriented dialogues. In P. Paggio & B. N. Wessel-Tolvig (Eds.), Proceedings from the 1st European symposium on multimodal communication University of Malta (pp. 55–64). Valletta: Linköping University Electronic Press.
Peirce, C. S. (1931). Elements of logic. Collected papers of Charles Sanders Peirce (Vol. 2). Cambridge: Harvard University Press.
Poggi, I. (2007). Hands, mind, face and body: A goal and belief view of multimodal communication. Berlin: Weidler.
Russell, J. A., & Mehrabian, A. (1977). Evidence for a three-factor theory of emotions. Journal of Research in Personality, 11, 273–294.
Savva, N., Scarinzi, A., & Bianchi-Berthouze, N. (2012). Continuous recognition of player’s affective body expression as dynamic quality of aesthetic experience. IEEE Transactions on Computational Intelligence and AI in Games, 4(3), 199–212.
Schegloff, E. A. (1984). On some gestures’ relation to talk. In J. M. Atkinson & J. Heritage (Eds.), Structures of social action (pp. 266–298). Cambridge: Cambridge University Press.
Studsgård, A. L., & Navarretta, C. (2013). Annotating attitudes in the Danish NOMCO corpus of first encounters. In NEALT proceedings. Northern European association for language and technology, 4th Nordic symposium on multimodal communication (pp. 85–89). Linköping University Electronic Press.
Vallduví, E., & Engdahl, E. (1996). The linguistic realisation of information packaging. Linguistics, 34(3), 459–520.
Witten, I. H., & Frank, E. (2005). Data mining: Practical machine learning tools and techniques (2nd edn.). San Francisco: Morgan Kaufmann.

Acknowledgments

The NOMCO project was funded by NOS-HS NORDCORP. We would like to acknowledge our partners from the Universities of Gothenburg and Helsinki, and the annotators of the Danish data: Sara Andersen, Josephine B. Arrild, Anette Studsgård and Bjørn N. Wessel-Tolvig. We would also like to thank the two anonymous reviewers for their helpful comments.

Author information

Authors and Affiliations

University of Copenhagen, Copenhagen, Denmark
Patrizia Paggio & Costanza Navarretta
University of Malta, Msida, Malta
Patrizia Paggio

Corresponding author

Correspondence to Patrizia Paggio.

Appendix

See Table 20.

Table 20 Gesture counts

Table 20 displays sums of the various gesture types in the corpus. Note that the total number of facial expressions is in fact 1448: to the 981 expressions that are annotated with one of the general facial features must be added 467 expressions that are only annotated with a feature related to the eyebrows. Conversely, there are 856 facial expressions with no eyebrow annotation. Similarly for body posture, there are 982 behaviours in total: to the 888 movements annotated with a body posture feature must be added 94 shoulder movements with no body posture annotation, while there are 826 body posture annotations not associated with a shoulder movement.
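
The totals quoted in this paragraph can be checked with a few lines of arithmetic. The Python sketch below is illustrative only: the variable and function names are hypothetical, and Table 20 itself is not reproduced here. The "overlap" figures are an inference rather than numbers stated in the text. They assume that every behaviour lacking the secondary annotation (eyebrows or shoulders) is among those carrying the main feature.

```python
# Counts quoted in the appendix paragraph (names are illustrative, not from the corpus).
face_general = 981         # expressions with a general facial feature
face_eyebrow_only = 467    # expressions annotated only with an eyebrow feature
face_no_eyebrow = 856      # expressions with no eyebrow annotation

body_posture = 888         # movements with a body posture feature
shoulder_only = 94         # shoulder movements with no body posture annotation
posture_no_shoulder = 826  # posture annotations not tied to a shoulder movement


def total_and_overlap(with_main, only_secondary, main_without_secondary):
    """Return (total behaviours, behaviours carrying both annotations).

    Assumption: behaviours without the secondary annotation all carry the
    main feature, so the overlap is the remainder of the main count.
    """
    total = with_main + only_secondary
    overlap = with_main - main_without_secondary
    return total, overlap


face_total, face_both = total_and_overlap(
    face_general, face_eyebrow_only, face_no_eyebrow
)
body_total, body_both = total_and_overlap(
    body_posture, shoulder_only, posture_no_shoulder
)

print("facial expressions:", face_total, "with both annotations:", face_both)
print("body behaviours:", body_total, "with both annotations:", body_both)
```

Under that assumption, the sums reproduce the stated totals of 1448 and 982, and the differences suggest how many behaviours carry both kinds of annotation.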

About this article

Cite this article

Paggio, P., Navarretta, C. The Danish NOMCO corpus: multimodal interaction in first acquaintance conversations. Lang Resources & Evaluation 51, 463–494 (2017). https://doi.org/10.1007/s10579-016-9371-6

Issue Date: June 2017