Explorations in multiparty casual social talk and its relevance for social human machine dialogue

Abstract

Much talk between humans is face-to-face, casual, multiparty, and of indefinite duration. Such casual conversation, or social talk, facilitates social bonding and mutual co-presence rather than serving strictly to exchange information in order to complete well-defined practical tasks. Artificial partners capable of participating as a speaker or listener in such talk would be useful in companionship, educational, and social contexts. However, to model social talk adequately, such applications require dialogue structure beyond simple question/answer routines. While there is a body of theory on multiparty casual talk, quantitative work in the area is lacking. Our work focuses on the anatomy of casual talk, in particular on two kinds of phase: chat (highly interactive dialogue exchanges) and chunks (longer contributions from single participants in the dialogue). We outline the current knowledge on the structure of casual talk and describe our investigations in this domain. We find that the duration distributions of chat and chunk phases differ, with chat phases being shorter than chunk phases. Chat is also more common at the start of conversations, with chunks becoming more prominent as the conversation progresses. Laughter and overlap are more common in chat phases than in chunk phases. We discuss how these insights can inform the design and implementation of truly social machine dialogue partners.

Notes

  1. Research that examines dialogue act labelling acknowledges the same distinction in attempting to discriminate backchannels from agreements [57]. However, an utterance such as “how about you?”, which might be labelled “open-question” (see [57, p. 341]), could equally be part of a chat phase or a chunk phase of a dialogue, though chunk-phase questions presumably place a greater burden on the system to respond informatively.

References

  1. Abercrombie D (1956) Problems and principles: studies in the teaching of English as a second language. Longmans, Green, London

  2. Akira H, Vogel C, Luz S, Campbell N (2017) Speech rate comparison when talking to a system and talking to a human: a study from a speech-to-speech, machine translation mediated map task. In: Proceedings of the 18th annual conference of the International Speech Communication Association (INTERSPEECH 2017). International Speech Communication Association, pp 3286–3290, ISSN 2308-457X

  3. Allen J, Byron D, Dzikovska M, Ferguson G, Galescu L, Stent A (2000) An architecture for a generic dialogue shell. Nat Lang Eng 6(3, 4):213–228

  4. Allen JF, Schubert LK, Ferguson G, Heeman P, Hwang CH, Kato T, Light M, Martin N, Miller B, Poesio M, Traum DR (1995) The TRAINS project: a case study in building a conversational planning agent. J Exp Theor Artif Intell 7(1):7–48

  5. Allen JF, Byron DK, Dzikovska M, Ferguson G, Galescu L, Stent A (2001) Toward conversational human–computer interaction. AI Mag 22(4):27

  6. Allwood J, Björnberg M, Grönqvist L, Ahlsén E, Ottesjö C (2000) The spoken language corpus at the Department of Linguistics, Göteborg University. In: FQS–Forum Qualitative Social Research, vol 1

  7. Anderson A, Bader M, Bard E, Boyle E, Doherty G, Garrod S, Isard S, Kowtko J, McAllister J, Miller J (1991) The HCRC map task corpus. Lang Speech 34(4):351–366

  8. Aubrey AJ, Marshall D, Rosin PL, Vandeventer J, Cunningham DW, Wallraven C (2013) Cardiff conversation database (CCDb): a database of natural dyadic conversations. In: 2013 IEEE conference on computer vision and pattern recognition workshops (CVPRW), pp 277–282

  9. Baker R, Hazan V (2011) DiapixUK: task materials for the elicitation of multiple spontaneous speech dialogs. Behav Res Methods 43(3):761–770

  10. Bakhtin MM (1986) The problem of speech genres. In: Emerson C, Holquist M (eds) Speech genres and other late essays (trans: VW McGee). University of Texas Press, pp 60–102

  11. Beattie G (1983) Talk: an analysis of speech and non-verbal behaviour in conversation. Open University Press, Milton Keynes

  12. Biber D, Johansson S, Leech G, Conrad S, Finegan E, Quirk R (1999) Longman grammar of spoken and written English, vol 2. Longman, London

  13. Bickmore T, Cassell J (2005) Social dialogue with embodied conversational agents. In: van Kuppevelt JCJ, Dybkjær L, Bernsen NO (eds) Advances in natural multimodal dialogue systems. Springer, Dordrecht, pp 23–54

  14. Bickmore T, Schulman D, Yin L (2010) Maintaining engagement in long-term interventions with relational agents. Appl Artif Intell 24(6):648–666

  15. BNC. British national corpus. http://www.natcorp.ox.ac.uk/. Accessed 5 Sept 2018

  16. Boersma P, Weenink D (2010) Praat: doing phonetics by computer [Computer program]. Version 5.1.44

  17. Bonin F, Campbell N, Vogel C (2012) Laughter and topic changes: Temporal distribution and information flow. In: 2012 IEEE 3rd international conference on cognitive infocommunications (CogInfoCom), pp 53–58

  18. Brown G, Yule G (1983) Teaching the spoken language, vol 2. Cambridge University Press, Cambridge

  19. Campbell N (2007) Approaches to conversational speech rhythm: speech activity in two-person telephone dialogues. In: Proceedings of 16th international congress of the phonetic sciences, Saarbrucken, Germany, pp 343–348

  20. Campbell N (2008) Multimodal processing of discourse information; the effect of synchrony. In: 2nd international symposium on universal communication, 2008. ISUC’08. pp 12–15

  21. Cheepen C (1988) The predictability of informal conversation. Pinter, London

  22. Collins KJ, Traum D (2016) Towards a multi-dimensional taxonomy of stories in dialogue. In: Proceedings of the 10th international conference on language resources and evaluation (LREC), Portoroz, Slovenia, 23–28 May

  23. Deese J (1980) Pauses, prosody, and the demands of production in language. Mouton Publishers, Berlin

  24. Devillers L, Rosset S, Duplessis GD, Sehili MA, Bechade L, Delaborde A, Gossart C, Letard V, Yang F, Yemez Y et al (2015) Multimodal data collection of human–robot humorous interactions in the joker project. IEEE, pp 348–354

  25. DuBois JW, Chafe WL, Meyer C, Thompson SA (2000) Santa Barbara Corpus of Spoken American English. CD-ROM. Linguistic Data Consortium, Philadelphia

  26. Dunbar R (1998) Grooming, gossip, and the evolution of language. Harvard University Press, Cambridge

  27. Edlund J, Beskow J, Elenius K, Hellmer K, Strömbergsson S, House D (2010) Spontal: a Swedish spontaneous dialogue corpus of audio, video and motion capture. In: Proceedings of the 7th international conference on language resources and evaluation (LREC 2010)

  28. Eggins S, Slade D (2004) Analysing casual conversation. Equinox Publishing Ltd, Sheffield

  29. Gilmartin E, Bonin F, Vogel C, Campbell N (2013) Laughter and topic transition in multiparty conversation. In: Proceedings of the SIGDIAL 2013 conference, Association for Computational Linguistics, Metz, France, pp 304–308

  30. Gilmartin E, Bonin F, Cerrato L, Vogel C, Campbell N (2015) What’s the game and who’s got the ball? genre in spoken interaction. In: 2015 AAAI Spring symposium series

  31. Gilmartin E, Cowan BR, Vogel C, Campbell N (2017) Exploring multiparty casual talk for social human–machine dialogue. In: Karpov A, Potapova R, Mporas I (eds) Speech and computer. Springer International Publishing, Cham, pp 370–378

  32. Godfrey JJ, Holliman EC, McDaniel J (1992) SWITCHBOARD: telephone speech corpus for research and development. In: 1992 IEEE international conference on acoustics, speech, and signal processing. ICASSP-92, vol 1, pp 517–520

  33. Greenbaum S (1991) ICE: The international corpus of English. English Today 28(7.4):3–7

  34. Grice HP (1975) Logic and conversation. In: Kimball JP, Cole P, Morgan JL (eds) Syntax and semantics. Vol. 3, speech acts. Academic Press, New York

  35. Hayakawa SI (1990) Language in thought and action. Houghton Mifflin Harcourt, New York

  36. Hennig S, Chellali R, Campbell N (2014) The D-ANS corpus: the Dublin-Autonomous nervous system corpus of biosignal and multimodal recordings of conversational speech. In: Proceedings of the 9th international conference on language resources and evaluation (LREC), Reykjavik, Iceland

  37. Jakobson R (1960) Linguistics and poetics. In: Sebeok TA (ed) Style in language. MIT Press, Cambridge, pp 350–377

  38. Janin A, Baron D, Edwards J, Ellis D, Gelbart D, Morgan N, Peskin B, Pfau T, Shriberg E, Stolcke A (2003) The ICSI meeting corpus. In: 2003 IEEE international conference on acoustics, speech, and signal processing. Proceedings. (ICASSP’03), vol 1, pp I–364

  39. Koutsombogera M, Vogel C (2018) Modeling collaborative multimodal behavior in group dialogues: the MULTISIMO Corpus. In: Calzolari N, Choukri K, Cieri C, Declerck T, Goggi S, Hasida K, Isahara H, Maegaard B, Mariani J, Mazo H, Moreno A, Odijk J, Piperidis S, Tokunaga T (eds) 11th International conference on language resources and evaluation (LREC 2018), Miyazaki, Japan, 7–12 May 2018. European Language Resources Association (ELRA), pp 2945–2951

  40. Kruijff-Korbayova I, Oleari E, Baroni I, Kiefer B, Zelati MC, Pozzi C, Sanna A (2014) Effects of off-activity talk in human–robot interaction with diabetic children. In: 2014 RO-MAN: The 23rd IEEE international symposium on robot and human interactive communication. IEEE, pp 649–654

  41. Laskowski K (2011) Predicting, detecting and explaining the occurrence of vocal activity in multi-party conversation. Ph.D thesis, Carnegie Mellon University

  42. Laver J (1975) Communicative functions of phatic communion. In: Kendon A, Harris RM, Key MR (eds) Organization of behavior in face-to-face interaction. Mouton, Oxford, pp 215–238

  43. Malinowski B (1936) The problem of meaning in primitive languages. In: The meaning of meaning: a study of the influence of language upon thought and of the science of symbolism, 4th edn. Kegan Paul, Trench, Trübner, London, pp 296–336

  44. Martin JG (1970) On judging pauses in spontaneous speech. J Verbal Learn Verbal Behav 9(1):75–78

  45. Mattar N, Wachsmuth I (2012) Small talk is more than chit-chat. In: Glimm B, Krüger A (eds) KI 2012: advances in artificial intelligence. Springer, Berlin, pp 119–130

  46. McCowan I, Carletta J, Kraaij W, Ashby S, Bourban S, Flynn M, Guillemot M, Hain T, Kadlec J, Karaiskos V (2005) The AMI Meeting Corpus. In: Proceedings of the 5th international conference on methods and techniques in behavioral research, vol 88

  47. Oertel C, Cummins F, Edlund J, Wagner P, Campbell N (2010) D64: a corpus of richly recorded conversational interaction. J Multimodal User Interfaces 7:1–10

  48. Oppermann D, Schiel F, Steininger S, Beringer N (2001) Off-talk—a problem for human–machine-interaction? In: EUROSPEECH-2001: 7th European conference on speech communication and technology, pp 2197–2200

  49. Paggio P, Allwood J, Ahlsén E, Jokinen K, Navarretta C (2010) The NOMCO multimodal Nordic resource-goals and characteristics. In: Proceedings of the 7th conference on international language resources and evaluation (LREC 10), Valletta, Malta, 19–21 May

  50. Porcheron M, Fischer JE, Sharples S (2017) Do animals have accents? talking with agents in multi-party conversation. In: Proceedings of the 2017 ACM conference on computer supported cooperative work and social computing. ACM, pp 207–219

  51. Raux A, Bohus D, Langner B, Black AW, Eskenazi M (2006) Doing research on a deployed spoken dialogue system: one year of Let's Go! experience. In: Proceedings of Interspeech, pp 65–68

  52. Rühlemann C, Gries S (2015) Turn order and turn distribution in multi-party storytelling. J Pragmat 87:171–191

  53. Schegloff E, Sacks H (1973) Opening up closings. Semiotica 8(4):289–327

  54. Schneider KP (1988) Small talk: analysing phatic discourse, vol 1. Hitzeroth, Marburg

  55. Schulman D, Bickmore T (2010) Modeling behavioral manifestations of coordination and rapport over multiple conversations. In: Intelligent virtual agents, pp 132–138

  56. Slade D (2007) The texture of casual conversation: a multidimensional interpretation. Equinox, Sheffield

  57. Stolcke A, Ries K, Coccaro N, Shriberg E, Bates R, Jurafsky D, Taylor P, Martin R, Van Ess-Dykema C, Meteer M (2000) Dialogue act modeling for automatic tagging and recognition of conversational speech. Comput Linguist 26(3):339–373

  58. Thornbury S, Slade D (2006) Conversation: from description to pedagogy. Cambridge University Press, Cambridge

  59. Ventola E (1979) The structure of casual conversation in English. J Pragmat 3(3):267–298

  60. Walker MA, Passonneau R, Boland JE (2001) Quantitative and qualitative evaluation of DARPA communicator spoken dialogue systems. In: Proceedings of the 39th annual meeting on Association for Computational Linguistics, Association for Computational Linguistics, Stroudsburg, PA, USA, ACL ’01, pp 515–522

  61. Williams J, Raux A, Henderson M (2016) The dialog state tracking challenge series: a review. Dialogue Discourse 7(3):4–33

  62. Wilson J (1989) On the boundaries of conversation, vol 10. Pergamon, Oxford

  63. Wittenburg P, Brugman H, Russel A, Klassmann A, Sloetjes H (2006) ELAN: a professional framework for multimodality research. In: Proceedings of the 5th international conference on language resources and evaluation (LREC)

  64. Włodarczak M, Laskowski K, Heldner M, Aare K (2017) Improving prediction of speech activity using multi-participant respiratory state. In: INTERSPEECH 2017, The International Speech Communication Association (ISCA), pp 1666–1670

  65. Yu Z, Papangelis A, Rudnicky A (2015) TickTock: a non-goal-oriented multimodal dialog system with engagement awareness. In: 2015 AAAI Spring symposium series

  66. Yu Z, Xu Z, Black AW, Rudnicky A (2016) Strategy and policy learning for non-task-oriented conversational systems. In: Proceedings of the 17th annual meeting of the special interest group on discourse and dialogue

Author information

Corresponding author

Correspondence to Emer Gilmartin.

About this article

Cite this article

Gilmartin, E., Cowan, B.R., Vogel, C. et al. Explorations in multiparty casual social talk and its relevance for social human machine dialogue. J Multimodal User Interfaces 12, 297–308 (2018). https://doi.org/10.1007/s12193-018-0274-2

  • DOI: https://doi.org/10.1007/s12193-018-0274-2

Keywords

  • Speech interfaces
  • Dialogue modelling
  • Casual social talk