Abstract
This paper presents a statistical analysis and comparison of spectral features that complement vocal tract characteristics (spectral centroid, spectral flatness measure, Shannon entropy, Rényi entropy, etc.) in emotional and neutral speech of male and female voices. The experiment used the German speech database EmoDB and Czech and Slovak speech material extracted from stories performed by professional actors. The analysis of these complementary spectral features (basic and extended statistical parameters and histograms of the feature distributions) for all three languages confirms that this approach can be used to classify emotional speech types.
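The four features named in the abstract can be illustrated with a minimal sketch for a single speech frame. The abstract does not give the frame length, window type, or Rényi order, so the Hamming window, the order `alpha = 2`, the base-2 logarithms, and the function name `spectral_features` are assumptions made here for illustration only, not the authors' implementation.

```python
import numpy as np

def spectral_features(frame, fs, alpha=2.0, eps=1e-12):
    """Return (centroid_hz, flatness, shannon_bits, renyi_bits) for one frame.

    Assumed settings (not taken from the paper): Hamming window,
    Renyi order alpha = 2, entropies in bits.
    """
    frame = np.asarray(frame, dtype=float)
    windowed = frame * np.hamming(len(frame))
    # One-sided power spectrum; eps keeps the logarithms finite.
    power = np.abs(np.fft.rfft(windowed)) ** 2 + eps
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / fs)
    # Normalise the spectrum so it can be treated as a probability mass function.
    p = power / power.sum()

    # Spectral centroid: power-weighted mean frequency.
    centroid = float(np.sum(freqs * p))
    # Spectral flatness: geometric mean over arithmetic mean of the power spectrum.
    flatness = float(np.exp(np.mean(np.log(power))) / np.mean(power))
    # Shannon entropy of the normalised spectrum.
    shannon = float(-np.sum(p * np.log2(p)))
    # Renyi entropy of order alpha (alpha != 1).
    renyi = float(np.log2(np.sum(p ** alpha)) / (1.0 - alpha))
    return centroid, flatness, shannon, renyi
```

Applied frame by frame, values like these could be collected per emotion and speaker gender and then summarised with statistical parameters and histograms, in the spirit of the analysis the abstract describes. A pure tone gives a low flatness and low entropy, whereas a noise-like frame gives values closer to the upper bounds.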
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Přibil, J., Přibilová, A. (2012). Comparison of Complementary Spectral Features of Emotional Speech for German, Czech, and Slovak. In: Esposito, A., Esposito, A.M., Vinciarelli, A., Hoffmann, R., Müller, V.C. (eds) Cognitive Behavioural Systems. Lecture Notes in Computer Science, vol 7403. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34584-5_20
DOI: https://doi.org/10.1007/978-3-642-34584-5_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34583-8
Online ISBN: 978-3-642-34584-5
eBook Packages: Computer Science, Computer Science (R0)