Comparison of Complementary Spectral Features of Emotional Speech for German, Czech, and Slovak

Přibil, Jiří; Přibilová, Anna

doi:10.1007/978-3-642-34584-5_20


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7403)


Abstract

This paper presents a statistical analysis and comparison of spectral features that complement vocal tract characteristics (spectral centroid, spectral flatness measure, Shannon entropy, Rényi entropy, etc.) in emotional and neutral speech of male and female voices. The experiment used the German speech database EmoDB and Czech and Slovak speech material extracted from stories performed by professional actors. Analysis of the complementary spectral features (basic and extended statistical parameters and histograms of the spectral feature distributions) for all three languages confirms that this approach can be used to classify emotional speech types.
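
The features named in the abstract can be illustrated with a minimal sketch. The code below uses common textbook definitions computed on the power spectrum of a single frame; the exact formulations, frame length, window, and Rényi order used in the chapter are not given in this preview, so the function name `complementary_spectral_features`, the Hamming window, and the default `alpha=2.0` are assumptions made only for illustration.

```python
import numpy as np


def complementary_spectral_features(frame, sample_rate, alpha=2.0, eps=1e-12):
    """Textbook spectral features for one speech frame (illustrative only).

    frame       -- 1-D array of samples
    sample_rate -- sampling frequency in Hz
    alpha       -- Renyi entropy order (assumed value; must differ from 1)
    eps         -- small constant that keeps the logarithms finite
    """
    if alpha == 1.0:
        raise ValueError("alpha must differ from 1 (that limit is Shannon entropy)")

    frame = np.asarray(frame, dtype=float)
    windowed = frame * np.hamming(len(frame))

    # One-sided power spectrum and the frequency of each bin in Hz
    power = np.abs(np.fft.rfft(windowed)) ** 2 + eps
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)

    # Normalise the power spectrum so it sums to one, like a probability mass function
    p = power / power.sum()

    # Spectral centroid: power-weighted mean frequency
    centroid = float(np.sum(freqs * p))

    # Spectral flatness: geometric mean over arithmetic mean of the power spectrum
    flatness = float(np.exp(np.mean(np.log(power))) / np.mean(power))

    # Shannon and Renyi spectral entropies in bits
    shannon = float(-np.sum(p * np.log2(p)))
    renyi = float(np.log2(np.sum(p ** alpha)) / (1.0 - alpha))

    return {
        "centroid_hz": centroid,
        "flatness": flatness,
        "shannon_entropy": shannon,
        "renyi_entropy": renyi,
    }


if __name__ == "__main__":
    # Synthetic 512-sample frames at 16 kHz: a 1 kHz tone and white noise
    fs = 16000
    t = np.arange(512) / fs
    tone = np.sin(2 * np.pi * 1000 * t)
    noise = np.random.default_rng(0).standard_normal(512)
    print(complementary_spectral_features(tone, fs))
    print(complementary_spectral_features(noise, fs))
```

In a study of this kind, such values would be computed frame by frame over each utterance and then summarised by statistical parameters and histograms per emotion; the synthetic tone and noise frames here only show that a peaked spectrum yields lower flatness and entropy than a flat one.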



Author information

Authors and Affiliations

Institute of Measurement Science, SAS, Dúbravská cesta 9, SK-841 04, Bratislava, Slovakia
Jiří Přibil
Institute of Electronics and Photonics, Faculty of Electrical Engineering & Information Technology, Slovak University of Technology, Ilkovičova 3, SK-812 19, Bratislava, Slovakia
Anna Přibilová


Editor information

Editors and Affiliations

Department of Psychology, and IIASS, Seconda Università degli Studi di Napoli, Italy
Anna Esposito
Istituto Nazionale di Geofisica e Vulcanologia, sezione di Napoli Osservatorio Vesuviano, Napoli, Italy
Antonietta M. Esposito
School of Computing Science, University of Glasgow, Glasgow, UK
Alessandro Vinciarelli
Laboratory of Acoustics and Speech Communication, Technische Universität Dresden, 01062, Dresden, Germany
Rüdiger Hoffmann
Dept. of Humanities and Social Sciences, Anatolia College/ACT, P.O. Box 21021, 55510, Pylaia, Greece
Vincent C. Müller


Cite this paper

Přibil, J., Přibilová, A. (2012). Comparison of Complementary Spectral Features of Emotional Speech for German, Czech, and Slovak. In: Esposito, A., Esposito, A.M., Vinciarelli, A., Hoffmann, R., Müller, V.C. (eds) Cognitive Behavioural Systems. Lecture Notes in Computer Science, vol 7403. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34584-5_20


DOI: https://doi.org/10.1007/978-3-642-34584-5_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34583-8
Online ISBN: 978-3-642-34584-5
eBook Packages: Computer Science (R0)
