Journal on Multimodal User Interfaces, Volume 12, Issue 4, pp 297–308

Explorations in multiparty casual social talk and its relevance for social human machine dialogue

  • Emer Gilmartin
  • Benjamin R. Cowan
  • Carl Vogel
  • Nick Campbell
Original Paper


Much talk between humans is face-to-face, casual, multiparty, and of indefinite duration. Such casual conversation or social talk facilitates social bonding and mutual co-presence rather than serving strictly to exchange information in order to complete well-defined practical tasks. Artificial partners capable of participating as a speaker or listener in such talk would be useful in companionship, educational, and social contexts. However, to adequately model social talk, such applications require dialogue structure beyond simple question/answer routines. While there is a body of theory on multiparty casual talk, there is a lack of quantitative work in the area. Our work focuses on the anatomy of casual talk, in particular phases of chat (highly interactive dialogue exchanges) and chunks (longer contributions from single participants in the dialogue). We outline the current knowledge on the structure of casual talk and describe our investigations in this domain. Our research finds that the duration distributions of chat and chunk phases differ, with chat phases being shorter than chunk phases. Chat is also more common at the start of conversations, with chunks becoming more prominent as the conversation progresses. Laughter and overlap are more common in chat phases than in chunk phases. We discuss how these insights can inform the design and implementation of truly social machine dialogue partners.


Keywords: Speech interfaces; Dialogue modelling; Casual social talk




Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Speech Communication Lab, Trinity College Dublin, Dublin, Ireland
  2. University College Dublin, Dublin, Ireland
  3. Trinity College Dublin, Dublin, Ireland
  4. Trinity Centre for Computing and Language Studies, School of Computer Science and Statistics, Trinity College Dublin, Dublin, Ireland
