Abstract
This paper presents an experimental design and setup for exploring the interaction between two children and their tutor during the question–answer session of a reading comprehension task. The multimodal aspects of the interactions are analysed in terms of the preferred signals and strategies that speakers employ to carry out successful multiparty conversations. This analysis will form the basis for the development of behavioural models that account for this specific context. We envisage the integration of such models into intelligent, context-aware systems, e.g. an embodied dialogue system that takes the role of a tutor and is able to carry out a discussion in a multiparty setting by exploiting the multimodal signals of the children. Such a system would be able to discuss a text and address questions to the children, encouraging collaboration and equal participation in the discussion and assessing the answers that the children give. The paper focuses on the design of an appropriate setup, the data collection, and the analysis of the multimodal signals that are important for the realisation of such a system.
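The equal-participation behaviour envisaged for the tutor could, purely for illustration, be approximated by a simple addressee-selection heuristic: the tutor directs its next question to the child who has so far spoken the least. The function names and the speaking-time bookkeeping below are hypothetical assumptions for this sketch, not a description of the paper's actual models.

```python
# Hypothetical sketch: a participation-balancing addressee policy for a
# tutor agent in a two-child question-answer session. The agent tracks
# accumulated speaking time per child and addresses the next question to
# the child who has participated less.

def choose_addressee(speaking_time):
    """Return the participant with the least accumulated speaking time."""
    return min(speaking_time, key=speaking_time.get)

def update_time(speaking_time, speaker, seconds):
    """Add a turn's duration (in seconds) to a speaker's running total."""
    speaking_time[speaker] = speaking_time.get(speaker, 0.0) + seconds
    return speaking_time

# Example: child_B has spoken less, so the tutor addresses child_B next.
times = {"child_A": 42.5, "child_B": 17.0}
print(choose_addressee(times))  # prints "child_B"
```

In a real system this decision would of course be informed by richer multimodal cues (gaze, posture, feedback signals) rather than speaking time alone; the sketch only illustrates the balancing principle.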
Notes
- 2. “What do you consider an ideal tutor? How would you like a tutor-avatar or a tutor-robot? We’ll guide you through our recording studio where we’ll record the way a tutor communicates with the students with the goal of designing a virtual tutor. Can you help us design the tutor of the future? All you have to do is answer a few simple questions and you’ll have the chance to see how we collect the data that are necessary to develop a virtual tutor.”
Acknowledgments
Research leading to these results has been funded in part by the Greek General Secretariat for Research and Technology, KRIPIS Action, under Grant No. 448306 (POLYTROPON). The authors would like to thank all the session subjects for their kind participation in the experiments. The authors would also like to express their appreciation to the reviewers for their valuable feedback and constructive comments.
Copyright information
© 2016 Springer International Publishing Switzerland
Cite this chapter
Koutsombogera, M., Deligiannis, M., Giagkou, M., Papageorgiou, H. (2016). Towards Modelling Multimodal and Multiparty Interaction in Educational Settings. In: Esposito, A., Jain, L. (eds) Toward Robotic Socially Believable Behaving Systems - Volume II. Intelligent Systems Reference Library, vol 106. Springer, Cham. https://doi.org/10.1007/978-3-319-31053-4_10
DOI: https://doi.org/10.1007/978-3-319-31053-4_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-31052-7
Online ISBN: 978-3-319-31053-4
eBook Packages: Engineering (R0)