Abstract
In expressive speech synthesis, some method of mimicking the way a specific speaker expresses emotions is needed. In this work we studied the suitability of long-term prosodic parameters and short-term spectral parameters for reflecting emotions in speech by analysing the results of two automatic emotion classification systems. These systems were trained with different emotional single-speaker databases recorded in standard Basque that include six emotions. Both systems differentiate among emotions for a specific speaker with high identification rates (above 75%), but the models are not applicable to other speakers (identification rates drop to 20%). Therefore, in the synthesis process, control of both spectral and prosodic features is essential to obtain expressive speech, and when a change of speaker is desired, the parameter values should be re-estimated.
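To put the two identification rates in context, the following minimal sketch compares them with the chance level of a six-class task. It assumes the six emotions are equally represented in the test material, which the abstract does not state; the function names `identification_rate` and `chance_level` are hypothetical helpers introduced here for illustration and are not taken from the chapter.

```python
def identification_rate(true_labels, predicted_labels):
    """Fraction of utterances whose predicted emotion matches the true one."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    if not true_labels:
        raise ValueError("need at least one utterance")
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)


def chance_level(num_classes):
    """Expected rate of a uniform random guess (assumes balanced classes)."""
    if num_classes < 1:
        raise ValueError("need at least one class")
    return 1.0 / num_classes


NUM_EMOTIONS = 6                 # six emotions, as stated in the abstract
SAME_SPEAKER_RATE = 0.75         # lower bound reported for the training speaker
CROSS_SPEAKER_RATE = 0.20        # rate reported for other speakers

chance = chance_level(NUM_EMOTIONS)
print(f"chance level: {chance:.1%}")
print(f"same-speaker rate relative to chance: {SAME_SPEAKER_RATE / chance:.1f}x")
print(f"cross-speaker rate relative to chance: {CROSS_SPEAKER_RATE / chance:.1f}x")
```

Under the balanced-class assumption, chance is about 16.7%, so the 20% cross-speaker rate is only slightly better than guessing, whereas the same-speaker rate is several times higher.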
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Navas, E., Hernáez, I., Luengo, I., Sainz, I., Saratxaga, I., Sanchez, J. (2007). Meaningful Parameters in Emotion Characterisation. In: Esposito, A., Faundez-Zanuy, M., Keller, E., Marinaro, M. (eds) Verbal and Nonverbal Communication Behaviours. Lecture Notes in Computer Science, vol 4775. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-76442-7_7
DOI: https://doi.org/10.1007/978-3-540-76442-7_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-76441-0
Online ISBN: 978-3-540-76442-7
eBook Packages: Computer Science, Computer Science (R0)