The Influence of Context Knowledge for Multi-modal Affective Annotation

  • Ingo Siegert
  • Ronald Böck
  • Andreas Wendemuth
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8008)


To enable successful human-computer interaction, automatic emotion recognition from speech has received increasing attention, which in turn raises the demand for valid data material. At the same time, finding appropriate labels for such material becomes more difficult.

Therefore, labels that are manageable for evaluators and cover nearly all occurring emotions have to be found. An important question is how context influences the annotators' decisions. In this paper, we present our investigations of emotional affective labelling on natural multi-modal data, exploring different types of contextual information and their influence on the annotation process.

Specifically, we investigate two contextual factors: the observable channels and the knowledge of the interaction course. We find that knowledge of the previous interaction course is needed to assess the affective state, but that the presence of the acoustic and video channels can partially compensate for the lack of discourse knowledge.


Keywords: emotion comparison, affective state labelling, context influence

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Ingo Siegert (1)
  • Ronald Böck (1)
  • Andreas Wendemuth (1)

  1. Cognitive Systems Group, Otto von Guericke University, Magdeburg, Germany