ikannotate – A Tool for Labelling, Transcription, and Annotation of Emotionally Coloured Speech

  • Ronald Böck
  • Ingo Siegert
  • Matthias Haase
  • Julia Lange
  • Andreas Wendemuth
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6974)


In speech recognition and emotion recognition from speech, high-quality transcription and annotation of the given material is important. To analyse prosodic features, linguistics provides several transcription systems. Furthermore, for emotion labelling, different methods have been proposed and discussed. In this paper, we introduce the tool ikannotate, which combines prosodic information with emotion labelling. It allows the generation of a transcription of material directly annotated with prosodic features. Moreover, material can be emotionally labelled according to Basic Emotions, the Geneva Emotion Wheel, and Self Assessment Manikins. Finally, we present the results of two usability tests, one assessing the ability to identify emotions during labelling and one comparing the transcription tool “Folker” with our application.


Keywords: Man-Machine-Interaction · Tool · Labelling · Transcription · Annotation




References

  1. Bradley, M., Lang, P.: Measuring emotion: The self-assessment manikin and the semantic differential. Journal of Behavior Therapy and Experimental Psychiatry 25, 49–59 (1994)
  2. Burkhardt, F., Paeschke, A., Rolfes, M., Sendlmeier, W., Weiss, B.: A Database of German Emotional Speech. In: Proc. of the 5th Interspeech (2005)
  3. Douglas-Cowie, E., Campbell, N., Cowie, R., Roach, P.: Emotional speech: towards a new generation of databases. Speech Communication 40 (Special Issue on Speech and Emotion), 33–60 (2003)
  4. Ehlich, K., Rehbein, J.: Erweiterte halbinterpretative Arbeitstranskriptionen (HIAT2): Intonation. Linguistische Berichte 59, 51–75 (1979)
  5. Ekman, P.: Are there basic emotions? Psychological Review 99, 550–553 (1992)
  6. Gnjatović, M., Rösner, D.: The NIMITEK Corpus of Affected Behavior in Human-Machine Interaction. In: Proc. of the Second International Workshop on Corpora for Research on Emotion and Affect (satellite of LREC 2008), Marrakech, Morocco (2008)
  7. Grimm, M., Kroschel, K., Mower, E., Narayanan, S.: Primitives-based evaluation and estimation of emotions in speech. Speech Communication 49(10-11), 787–800 (2007)
  8. MacWhinney, B.: The CHILDES Project: Tools for Analyzing Talk, 3rd edn. Lawrence Erlbaum Associates, Mahwah (2000)
  9. Mehrabian, A.: Pleasure-arousal-dominance: A general framework for describing and measuring individual differences in temperament. Current Psychology 14(4), 261–292 (1996)
  10. Paul, D., Baker, J.: The Design for the Wall Street Journal-based CSR Corpus. In: Proc. of the Workshop on Speech and Natural Language, Stroudsburg, PA, USA, pp. 357–362 (1992)
  11. Picard, R.W.: Affective Computing. MIT Press, Cambridge, MA (2000)
  12. Rehbein, J.: Remarks on the empirical analysis of action and speech: The case of question sequences in classroom discourse. Journal of Pragmatics 8(1), 49–63 (1984)
  13. Scherer, K.: What are emotions? And how can they be measured? Social Science Information 44(4), 695–729 (2005)
  14. Scherer, S., Siegert, I., Bigalke, L., Meudt, S.: Developing an Expressive Speech Labeling Tool Incorporating the Temporal Characteristics of Emotion. In: Proc. of the 7th International Conference on Language Resources and Evaluation (2010)
  15. Schmidt, T., Schütte, W.: FOLKER: An Annotation Tool for Efficient Transcription of Natural, Multi-Party Interaction. In: Proc. of the 7th International Conference on Language Resources and Evaluation (2010)
  16. Selting, M., Auer, P., Barden, B., Bergmann, J., Couper-Kuhlen, E., Günthner, S., Meier, C., Quasthoff, U., Schlobinski, P., Uhmann, S.: Gesprächsanalytisches Transkriptionssystem (GAT). Linguistische Berichte 173, 91–122 (1998)
  17. Selting, M., Auer, P., Barth-Weingarten, D., Bergmann, J., Bergmann, P., Birkner, K., Couper-Kuhlen, E., Deppermann, A., Gilles, P., Günthner, S., Hartung, M., Kern, F., Mertzlufft, C., Meyer, C., Morek, M., Oberzaucher, F., Peters, J., Quasthoff, U., Schütte, W., Stukenbrock, A., Uhmann, S.: A system for transcribing talk-in-interaction: GAT 2. To appear in: Gesprächsforschung – Online-Zeitschrift zur verbalen Interaktion 12 (2011)
  18. Siegert, I., Böck, R., Philippou-Hübner, D., Vlasenko, B., Wendemuth, A.: Appropriate Emotional Labeling of Non-Acted Speech Using Basic Emotions, Geneva Emotion Wheel and Self Assessment Manikins. In: Proc. of the IEEE Int. Conf. on Multimedia & Expo, Barcelona (2011)
  19. Strauss, P.M., Hoffmann, H., Minker, W., Neumann, H., Palm, G., Scherer, S., Traue, H., Weidenbacher, U.: The PIT corpus of German multi-party dialogues. In: Proc. of the Sixth International Conference on Language Resources and Evaluation (LREC 2008), Marrakech, Morocco (2008)
  20. Young, S., Evermann, G., Gales, M., Hain, T., Kershaw, D., Liu, X., Moore, G., Odell, J., Ollason, D., Povey, D., Valtchev, V., Woodland, P.: The HTK Book, version 3.4. Cambridge University Engineering Department (2009)

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Ronald Böck (1)
  • Ingo Siegert (1)
  • Matthias Haase (2)
  • Julia Lange (2)
  • Andreas Wendemuth (1)

  1. Dept. of Electrical Engineering and Information Technology, Otto von Guericke University, Magdeburg, Germany
  2. Dept. of Psychosomatic Medicine and Psychotherapy, Otto von Guericke University, Magdeburg, Germany
