Switching Wizard of Oz for the online evaluation of backchannel behavior

  • Original Paper
Journal on Multimodal User Interfaces

Abstract

The Switching Wizard of Oz (SWOZ) is a setup for evaluating human behavior synthesis algorithms in online face-to-face interactions. Conversational partners are represented to each other as virtual agents whose animated behavior is either generated by a synthesis algorithm or driven by the actual behavior of the conversational partner. Human and algorithm have the same expressive capabilities. The source is switched at random intervals, so the algorithm’s behavior can be identified only when it deviates from what is regarded as appropriate. The SWOZ approach is especially suitable for the controlled evaluation of synthesis algorithms that consider a limited set of behaviors. We evaluate a backchannel synthesis algorithm for speaker–listener dialogs using an asymmetric version of the framework. Human speakers talk to virtual listeners that are controlled either by human listeners or by an algorithm. Speakers indicate when they feel they are no longer talking to a human listener. Analysis of these responses reveals patterns of inappropriate behavior in terms of the quantity and timing of backchannels. These insights can be used to improve synthesis algorithms.
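The random source switching described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the uniform distribution of interval lengths, and the interval bounds are all assumptions made for illustration, since the abstract only states that the source is switched at random intervals.

```python
import random


def switching_schedule(duration, min_interval, max_interval, seed=None):
    """Build a hypothetical switching schedule for one interaction.

    Returns a list of (start, end, source) segments covering [0, duration],
    where source alternates between "human" and "algorithm".
    Interval lengths are drawn uniformly (an assumption, not from the paper).
    The final segment is truncated at `duration`, so it may be shorter
    than `min_interval`.
    """
    rng = random.Random(seed)
    source = rng.choice(["human", "algorithm"])  # random initial source
    t = 0.0
    segments = []
    while t < duration:
        length = rng.uniform(min_interval, max_interval)
        end = min(t + length, duration)
        segments.append((t, end, source))
        # Switch the source that drives the virtual listener.
        source = "algorithm" if source == "human" else "human"
        t = end
    return segments


def source_at(segments, time):
    """Return which source was driving the agent at a given time."""
    for start, end, source in segments:
        if start <= time < end:
            return source
    return segments[-1][2]  # time at or beyond the end of the session


if __name__ == "__main__":
    schedule = switching_schedule(120.0, 10.0, 30.0, seed=1)
    for start, end, source in schedule:
        print(f"{start:6.1f} - {end:6.1f}  {source}")
```

In such a sketch, a speaker's "not human" response at time t could be attributed to the active source with `source_at(schedule, t)`, which separates correct detections of the algorithm from false alarms during human-driven segments.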

References

  1. Bailenson JN, Yee N, Patel K, Beall AC (2008) Detecting digital chameleons. Comput Hum Behav 24(1):66–87

  2. Bavelas JB, Coates L, Johnson T (2002) Listener responses as a collaborative process: the role of gaze. J Commun 52(3):566–580

  3. Bente G, Krämer NC, Petersen A, de Ruiter JP (2001) Computer animated movement and person perception: methodological advances in nonverbal behavior research. J Nonverbal Behav 25(3):151–166

  4. Brunner LJ (1979) Smiles can be back channels. J Pers Soc Psychol 37(5):728–734

  5. Cathcart N, Carletta J, Klein E (2003) A shallow model of backchannel continuers in spoken dialogue. In: Proceedings of the conference of the European chapter of the association for computational linguistics, Budapest, Hungary, vol 1, pp 51–58

  6. Chang CC, Lin CJ (2011) LibSVM: a library for support vector machines. ACM Trans Intell Syst Technol 2(3):1–27

  7. Dahlbäck N, Jönsson A, Ahrenberg L (1993) Wizard of Oz studies: why and how. In: Proceedings of the international conference on intelligent user interfaces (IUI), Orlando, FL, pp 193–200

  8. Duncan S Jr (1974) On the structure of speaker–auditor interaction during speaking turns. Lang Soc 3(2):161–180

  9. de Kok I, Ozkan D, Heylen D, Morency LP (2010) Learning and evaluating response prediction models using parallel listener consensus. In: Proceedings of the international conference on multimodal interfaces (ICMI), Beijing, China

  10. de Kok I, Poppe R, Heylen D (2012) Iterative perceptual learning for social behavior synthesis. Technical report, TR-CTIT-12-01, University of Twente

  11. Edlund J, Beskow J (2009) Mushypeek: a framework for online investigation of audiovisual dialogue phenomena. Lang Speech 52(2–3):351–367

  12. Heylen D, Bevacqua E, Pelachaud C, Poggi I, Gratch J, Schröder M (2011) Generating listening behaviour. In: Cowie R, Pelachaud C, Petta P (eds) Emotion-oriented systems. Cognitive Technologies. Springer, Berlin, pp 321–347

  13. Hoai M, De la Torre F (2012) Max-margin early event detectors. In: Proceedings of the conference on computer vision and pattern recognition (CVPR), Providence, RI, pp 2863–2870

  14. Huang L, Morency LP, Gratch J (2010) Learning backchannel prediction model from parasocial consensus sampling: a subjective evaluation. In: Proceedings of the international conference on intelligent virtual agents (IVA), Philadelphia, PA, pp 159–172

  15. Huang L, Morency LP, Gratch J (2011) Virtual rapport 2.0. In: Proceedings of the international conference on intelligent virtual agents (IVA), Reykjavik, Iceland, pp 68–79

  16. Krauss RM, Garlock CM, Bricker PD, McMahon LE (1977) The role of audible and visible back-channel responses in interpersonal communication. J Pers Soc Psychol 35(7):523–529

  17. Li HZ (2006) Backchannel responses as misleading feedback in intercultural discourse. J Intercult Commun Res 35(2):99–116

  18. Martin JC, Paggio P, Kuehnlein P, Stiefelhagen R, Pianesi F (2008) Introduction to the special issue on multimodal corpora for modeling human multimodal behavior. Lang Resour Eval 42(2):253–264

  19. McDonnell R, Ennis C, Dobbyn S, O’Sullivan C (2009) Talking bodies: sensitivity to desynchronization of conversations. ACM Trans Appl Percept 6(4):A22

  20. McKeown G, Valstar M, Cowie R, Pantic M, Schröder M (2012) The SEMAINE database: annotated multimodal records of emotionally colored conversations between a person and a limited agent. IEEE Trans Affect Comput 3(1):5–17

  21. Morency LP, de Kok I, Gratch J (2010) A probabilistic multimodal approach for predicting listener backchannels. Auton Agents Multi-Agent Syst 20(1):80–84

  22. Poppe R, ter Maat M, Heylen D (2012) Online backchannel synthesis evaluation with the Switching Wizard of Oz. In: Joint proceedings of the intelligent virtual agents (IVA) 2012 workshops, Santa Cruz, CA, pp 75–82

  23. Poppe R, ter Maat M, Heylen D (2012) Online behavior evaluation with the Switching Wizard of Oz. In: Proceedings of the international conference on intelligent virtual agents (IVA), Santa Cruz, CA, pp 486–488

  24. Poppe R, Truong KP, Heylen D (2013) Perceptual evaluation of backchannel strategies for artificial listeners. J Auton Agents Multi-Agent Syst 27(2):235–253

  25. Schedl M (2006) The CoMIRVA toolkit for visualizing music-related data. Technical report, Department of Computational Perception, Johannes Kepler University Linz

  26. Truong KP, Poppe R, de Kok I, Heylen D (2011) A multimodal analysis of vocal and visual backchannels in spontaneous dialogs. In: Proceedings of interspeech, Florence, Italy, pp 2973–2976

  27. Turing AM (1950) Computing machinery and intelligence. Mind 59(236):433–460

  28. van Welbergen H, Reidsma D, Ruttkay Z, Zwiers J (2010) Elckerlyc—a BML realizer for continuous, multimodal interaction with a virtual human. J Multimodal User Interfaces 3(4):271–284

  29. Wang Z, Lee J, Marsella S (2013) Multi-party, multi-role comprehensive listening behavior. J Auton Agents Multi-Agent Syst 27(2):218–234

  30. Ward N, Tsukahara W (2000) Prosodic features which cue back-channel responses in English and Japanese. J Pragmat 32(8):1177–1207

  31. Xudong D (2009) Listener response. In: The pragmatics of interaction. John Benjamins Publishing, Amsterdam, pp 104–124

  32. Yngve VH (1970) On getting a word in edgewise. In: Papers from the sixth regional meeting of Chicago Linguistic Society. Chicago Linguistic Society, Chicago, pp 567–577

Author information

Corresponding author

Correspondence to Ronald Poppe.

Additional information

This publication was supported by the Dutch national program COMMIT. Preliminary versions of this work appeared as [22, 23].

About this article

Cite this article

Poppe, R., ter Maat, M. & Heylen, D. Switching Wizard of Oz for the online evaluation of backchannel behavior. J Multimodal User Interfaces 8, 109–117 (2014). https://doi.org/10.1007/s12193-013-0131-2

  • DOI: https://doi.org/10.1007/s12193-013-0131-2
