Multiple Feature Extraction and Hierarchical Classifiers for Emotions Recognition

Albornoz, Enrique M.; Milone, Diego H.; Rufiner, Hugo L.

doi:10.1007/978-3-642-12397-9_20

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5967)

Abstract

The recognition of a speaker's emotional state is a multidisciplinary research area that has received great interest in recent years. One of its most important goals is to improve voice-based human-machine interaction. Recent work in this domain uses prosodic features and the spectral characteristics of the speech signal with standard classification methods. However, the performance improvements obtained with these traditional methods have reached a limit. In this paper, the spectral characteristics of emotional signals are used to group emotions. Standard classifiers based on Gaussian Mixture Models, Hidden Markov Models and Multilayer Perceptrons are tested. These classifiers are evaluated in different configurations and with different features in order to design a new hierarchical method for emotion classification. The proposed multiple-feature hierarchical method improves performance by 6.35% over the standard classifiers.
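
The hierarchical idea summarized in the abstract can be illustrated with a minimal two-stage sketch: a first classifier assigns an utterance to a group of emotions, and a second classifier, trained only on that group and possibly fed a different feature set, picks the final emotion. Everything concrete here is an assumption made for illustration and is not taken from the chapter: the emotion names, the two groups, the toy feature values, and the class names `NearestCentroid` and `HierarchicalClassifier`. A nearest-centroid rule stands in for the Gaussian Mixture Model, Hidden Markov Model and Multilayer Perceptron classifiers the authors evaluate.

```python
from collections import defaultdict
from math import dist


class NearestCentroid:
    """Toy stand-in for the GMM/HMM/MLP classifiers named in the abstract."""

    def fit(self, X, y):
        sums, counts = {}, defaultdict(int)
        for x, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(x))
            sums[label] = [a + v for a, v in zip(acc, x)]
            counts[label] += 1
        # One mean vector per class.
        self.centroids = {
            label: [s / counts[label] for s in vec] for label, vec in sums.items()
        }
        return self

    def predict_one(self, x):
        # Return the class whose centroid is closest in Euclidean distance.
        return min(self.centroids, key=lambda lab: dist(x, self.centroids[lab]))


class HierarchicalClassifier:
    """Stage 1 picks an emotion group; stage 2 picks an emotion inside it."""

    def __init__(self, groups):
        # groups: group name -> list of emotion labels (hypothetical grouping).
        self.groups = groups
        self.group_of = {lab: g for g, labs in groups.items() for lab in labs}

    def fit(self, X_stage1, X_stage2, y):
        # Stage 1 is trained on group labels using the first feature set.
        self.stage1 = NearestCentroid().fit(X_stage1, [self.group_of[l] for l in y])
        # One stage-2 classifier per group, trained on the second feature set.
        self.stage2 = {}
        for g in self.groups:
            idx = [i for i, l in enumerate(y) if self.group_of[l] == g]
            self.stage2[g] = NearestCentroid().fit(
                [X_stage2[i] for i in idx], [y[i] for i in idx]
            )
        return self

    def predict_one(self, x_stage1, x_stage2):
        group = self.stage1.predict_one(x_stage1)
        return self.stage2[group].predict_one(x_stage2)


# Made-up training data: one stage-1 feature, two stage-2 features.
y = ["anger", "anger", "joy", "joy", "sadness", "sadness", "boredom", "boredom"]
X1 = [[0.9], [1.0], [0.8], [0.9], [0.1], [0.0], [0.2], [0.1]]
X2 = [[1.0, 0.1], [0.9, 0.0], [0.1, 1.0], [0.0, 0.9],
      [1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]

model = HierarchicalClassifier(
    {"high": ["anger", "joy"], "low": ["sadness", "boredom"]}
).fit(X1, X2, y)

print(model.predict_one([0.85], [0.95, 0.05]))
print(model.predict_one([0.15], [0.05, 0.95]))
```

In this toy data the stage-2 features alone cannot separate "anger" from "sadness", since their values nearly coincide; the stage-1 grouping resolves the ambiguity first, which is the kind of benefit a hierarchical arrangement is meant to provide.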

Author information

Authors and Affiliations

Centro de I+D en Señales, Sistemas e INteligencia Computacional (SINC(i)) Facultad de Ingeniería y Ciencias Hídricas, Universidad Nacional del Litoral, Ciudad Universitaria, Paraje El Pozo, S3000, Santa Fe, Argentina
Enrique M. Albornoz, Diego H. Milone & Hugo L. Rufiner
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Argentina
Enrique M. Albornoz, Diego H. Milone & Hugo L. Rufiner
Laboratorio de Cibernética, Fac. de Ingeniería, Universidad Nacional de Entre Ríos, Argentina
Hugo L. Rufiner

Editor information

Editors and Affiliations

Second University of Naples, and IIASS, Via Pellegrino, 84019, Vietri sul Mare, SA, Italy
Anna Esposito
Centre for Language and Communication Studies, Trinity College, The University of Dublin, Dublin 2, Ireland
Nick Campbell & Carl Vogel
Department of Computing Science & Mathematics, University of Stirling, FK9 4LA, Stirling, Scotland, UK
Amir Hussain
Faculty of Electrical Engineering, Mathematics and Computer Science, University of Twente, P.O. Box 217, 7500 AE, Enschede, The Netherlands
Anton Nijholt

About this chapter

Cite this chapter

Albornoz, E.M., Milone, D.H., Rufiner, H.L. (2010). Multiple Feature Extraction and Hierarchical Classifiers for Emotions Recognition. In: Esposito, A., Campbell, N., Vogel, C., Hussain, A., Nijholt, A. (eds) Development of Multimodal Interfaces: Active Listening and Synchrony. Lecture Notes in Computer Science, vol 5967. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12397-9_20

Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12396-2
Online ISBN: 978-3-642-12397-9
eBook Packages: Computer Science, Computer Science (R0)
