Study on Speaker-Independent Emotion Recognition from Speech on Real-World Data

  • Theodoros Kostoulas
  • Todor Ganchev
  • Nikos Fakotakis
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5042)

Abstract

In the present work we report results from ongoing research on speaker-independent emotion recognition. Experiments examine the behavior of a detector of negative emotional states over both acted and non-acted speech. Furthermore, score-level fusion of two classifiers is applied at the utterance level in an attempt to improve the performance of the emotion recognizer. Experimental results demonstrate significant differences in recognizing emotions in acted versus real-world speech.
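
The abstract does not specify the fusion rule; as a minimal sketch of score-level fusion at the utterance level, one plausible realization is a weighted sum of the two classifiers' per-utterance scores, thresholded into a binary negative/non-negative decision. The function name, the equal weighting, and the decision threshold below are illustrative assumptions, not the authors' method.

    import numpy as np

    def fuse_utterance_scores(scores_a, scores_b, weight=0.5, threshold=0.5):
        # Score-level fusion: combine the per-utterance scores produced by
        # two classifiers with a weighted sum, then threshold the fused
        # score into a binary negative/non-negative emotion decision.
        # (Weight and threshold values here are assumptions for illustration.)
        scores_a = np.asarray(scores_a, dtype=float)
        scores_b = np.asarray(scores_b, dtype=float)
        fused = weight * scores_a + (1.0 - weight) * scores_b
        return fused >= threshold

    # Example: fused decisions for three utterances.
    print(fuse_utterance_scores([0.9, 0.2, 0.6], [0.8, 0.1, 0.4]))
    # -> [ True False  True]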

Keywords

Emotion recognition, real-life data



Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Theodoros Kostoulas (1)
  • Todor Ganchev (1)
  • Nikos Fakotakis (1)

  1. Artificial Intelligence Group, Wire Communications Laboratory, Electrical and Computer Engineering Department, University of Patras, Rion-Patras, Greece
