
Investigating Glottal Parameters and Teager Energy Operators in Emotion Recognition

  • Conference paper
Affective Computing and Intelligent Interaction (ACII 2011)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 6975)

Abstract

The purpose of this paper is to study the performance of glottal waveform parameters and the Teager Energy Operator (TEO) in distinguishing binary classes of four emotion dimensions (activation, expectation, power, and valence) using authentic emotional speech. The two feature sets were compared with a 1941-dimensional acoustic feature set comprising prosodic, spectral, and other voicing-related features extracted with the openSMILE toolkit. The comparison highlights the discriminative ability of TEO for the activation and power dimensions, and of the glottal parameters for expectation and valence, on authentic speech data. Using the same classification methodology, the TEO and glottal-parameter features outperformed or performed comparably to the prosodic, spectral, and other voicing-related features (i.e., the openSMILE feature set).
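The TEO referred to above is Kaiser's nonlinear energy operator, commonly written in discrete time as Ψ[x(n)] = x(n)² − x(n−1)·x(n+1). A minimal sketch of this operator is given below; it illustrates the operator itself, not the authors' full feature-extraction pipeline, which the abstract does not detail.

```python
import numpy as np

def teager_energy(x):
    """Discrete Teager Energy Operator (Kaiser, 1990):
    Psi[x(n)] = x(n)^2 - x(n-1) * x(n+1).
    Returns an array two samples shorter than the input."""
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]

# For a pure tone A*cos(Omega*n) the operator evaluates to the constant
# A^2 * sin^2(Omega), so it jointly reflects amplitude and frequency,
# which is why it is useful as a cue for vocal stress and emotion.
n = np.arange(200)
tone = 0.5 * np.cos(0.3 * n)
psi = teager_energy(tone)
```

For the sinusoid above, every output sample equals 0.5² · sin²(0.3), confirming the operator's sensitivity to both the amplitude and the frequency of the signal.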





Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Sun, R., Moore, E. (2011). Investigating Glottal Parameters and Teager Energy Operators in Emotion Recognition. In: D’Mello, S., Graesser, A., Schuller, B., Martin, JC. (eds) Affective Computing and Intelligent Interaction. ACII 2011. Lecture Notes in Computer Science, vol 6975. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24571-8_54


  • DOI: https://doi.org/10.1007/978-3-642-24571-8_54

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-24570-1

  • Online ISBN: 978-3-642-24571-8

  • eBook Packages: Computer Science (R0)
