Going Further in Affective Computing: How Emotion Recognition Can Improve Adaptive User Interaction

  • Sascha Meudt
  • Miriam Schmidt-Wack
  • Frank Honold
  • Felix Schüssel
  • Michael Weber
  • Friedhelm Schwenker
  • Günther Palm
Chapter
Part of the Intelligent Systems Reference Library book series (ISRL, volume 105)

Abstract

This article joins the fields of emotion recognition and human-computer interaction. While much work has been done on recognizing emotions, they are hardly used to improve a user's interaction with a system. Although the fields of affective computing and especially serious games already make use of detected emotions, they tend to provide application- and user-specific adaptations only on the task level. We present an approach that utilizes recognized emotions to improve the interaction itself, independent of the underlying application at hand. Building on the state of the art in emotion recognition research and on the architecture of Companion-Systems, we present a generic approach for determining the main cause of an emotion within the history of interactions, allowing a specific reaction and adaptation. Such an approach could lead to systems that use emotions to improve not only the outcome of a task but the interaction itself, in order to be truly individual and empathic.
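The abstract describes the approach only conceptually: attribute a recognized emotion to a likely cause in the interaction history, then adapt the interaction itself rather than the task. As a rough illustration of that idea, the following minimal Python sketch shows one naive way it could be realized. Everything here (the event and emotion data model, the 10-second attribution window, the adaptation labels) is a hypothetical assumption for illustration, not taken from the chapter.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical data model -- the chapter defines no concrete types;
# these names and fields are illustrative only.
@dataclass
class InteractionEvent:
    timestamp: float   # seconds since session start
    kind: str          # e.g. "system_output", "user_input", "error_dialog"
    modality: str      # e.g. "speech", "gui", "gesture"

@dataclass
class EmotionEstimate:
    timestamp: float
    label: str         # e.g. "frustration"
    confidence: float  # classifier confidence in [0, 1]

def attribute_emotion(history: List[InteractionEvent],
                      emotion: EmotionEstimate,
                      window: float = 10.0) -> Optional[InteractionEvent]:
    """Naive cause attribution: pick the most recent system-side event
    within `window` seconds before the emotion was detected."""
    candidates = [e for e in history
                  if e.kind == "system_output"
                  and 0.0 <= emotion.timestamp - e.timestamp <= window]
    return max(candidates, key=lambda e: e.timestamp, default=None)

def adapt_interaction(cause: Optional[InteractionEvent],
                      emotion: EmotionEstimate) -> str:
    """Choose an interaction-level (not task-level) adaptation."""
    if cause is None or emotion.confidence < 0.5:
        return "no_change"
    if emotion.label == "frustration" and cause.modality == "speech":
        # e.g. fall back from speech output to a GUI presentation
        return "switch_output_modality:gui"
    return "offer_explanatory_dialog"
```

A real Companion-System would likely replace the fixed time window with reasoning over its dialog and context models; the sketch only makes the cause-attribution-then-adaptation loop concrete.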

Notes

Acknowledgments

This work was supported by the Transregional Collaborative Research Center SFB/TRR 62 “Companion-Technology for Cognitive Technical Systems”, which is funded by the German Research Foundation (DFG).


Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Sascha Meudt (1)
  • Miriam Schmidt-Wack (1)
  • Frank Honold (2)
  • Felix Schüssel (2)
  • Michael Weber (2)
  • Friedhelm Schwenker (1)
  • Günther Palm (1)

  1. Institute of Neural Information Processing, Ulm University, Ulm, Germany
  2. Institute of Media Informatics, Ulm University, Ulm, Germany
