Global Selection of Features for Nonlinear Dynamics Characterization of Emotional Speech
This paper proposes the application of measures based on nonlinear dynamics to the characterization of emotional speech. Measures such as mutual information, correlation dimension, correlation entropy, Shannon entropy, Lempel–Ziv complexity and the Hurst exponent are extracted from the samples of an emotional speech database. Summary statistics (mean, standard deviation, skewness and kurtosis) are then computed from the extracted measures. Experiments were conducted on the Berlin emotional speech database for a three-class problem (neutral, fear and anger as emotional states). A feature selection methodology, based on forward floating selection, is proposed to identify the best features, and a neural network classifier is used to evaluate their discrimination ability. The global success rate is 93.78 ± 3.18 %.
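As a minimal sketch of two of the steps described above (not the authors' implementation), the following illustrates an LZ78-style estimate of Lempel–Ziv complexity on a median-binarized signal, and the four summary statistics computed over a sequence of per-frame measure values. All function names are illustrative.

```python
def lz_complexity(bits: str) -> int:
    """Count the distinct phrases in an LZ78-style parse of a binary string.
    A rough stand-in for the Lempel-Ziv complexity measure used in the paper."""
    phrases, phrase, count = set(), "", 0
    for ch in bits:
        phrase += ch
        if phrase not in phrases:  # a new phrase ends here
            phrases.add(phrase)
            count += 1
            phrase = ""
    return count

def binarize(signal) -> str:
    """Binarize a signal around its median, a common preprocessing step
    before computing Lempel-Ziv complexity."""
    m = sorted(signal)[len(signal) // 2]
    return "".join("1" if v > m else "0" for v in signal)

def summary_stats(x):
    """Mean, standard deviation, skewness and kurtosis of per-frame measures."""
    n = len(x)
    mu = sum(x) / n
    var = sum((v - mu) ** 2 for v in x) / n
    sd = var ** 0.5
    skew = sum((v - mu) ** 3 for v in x) / (n * sd ** 3) if sd else 0.0
    kurt = sum((v - mu) ** 4 for v in x) / (n * sd ** 4) if sd else 0.0
    return mu, sd, skew, kurt

# A constant sequence parses into fewer phrases than an irregular one.
print(lz_complexity("0000000000000000"))
print(lz_complexity("0001101001000101"))
print(summary_stats([1.0, 2.0, 3.0, 4.0, 5.0]))
```

In the paper's pipeline, statistics like these would be computed per utterance from the frame-wise nonlinear measures and then passed to feature selection and the neural network classifier.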
Keywords: Nonlinear dynamics · Emotional speech · Forward floating feature selection · Neural networks
This work has been funded by the Spanish government MICINN research project TEC2009-14123-C04 and by a research training grant from the ACIISI of the Canary Islands Autonomous Government (Spain), co-financed at a rate of 85 % by the European Social Fund (ESF). This work was also supported by CODI at Universidad de Antioquia, project MC11-1-03.
- 2. Burkhardt F, Polzehl T, Stegmann J, Metze F, Huber R. Detecting real life anger. In: Proceedings of the IEEE international conference on acoustics, speech and signal processing (ICASSP 2009). Taipei: IEEE Press; 2009. p. 4761–4764.
- 5. Wu S, Falk TH, Chan W-Y. Automatic recognition of speech emotion using long-term spectro-temporal features. In: Proceedings of the 16th IEEE international conference on digital signal processing. Santorini, Greece; 5–7 July 2009, p. 1–6.
- 6. Giannakopoulos T, Pikrakis A, Theodoridis S. A dimensional approach to emotion recognition of speech from movies. In: Proceedings of the 34th IEEE international conference on acoustics, speech and signal processing (ICASSP 2009). Taipei, Taiwan; 19–24 April 2009, p. 65–68.
- 8. Burkhardt F, Paeschke A, Rolfes M, Sendlmeier WF, Weiss B. A database of German emotional speech. In: Proceedings of the 6th annual conference of the international speech communication association (Interspeech 2005), Lisbon, Portugal; 4–8 September 2005, p. 1517–1520. http://pascal.kgw.tu-berlin.de/emodb/.
- 11. Alonso JB, Díaz-de-María F, Travieso CM, Ferrer MA. Using nonlinear features for voice disorder detection. In: Proceedings of the 3rd international conference on nonlinear speech processing. Barcelona, Spain; 2005, p. 94–106.
- 12. Vaziri G, Almasganj F, Jenabi MS. On the fractal self-similarity of laryngeal pathologies detection: the estimation of Hurst parameter. In: Proceedings of the 5th international conference on information technology and applications in biomedicine. Shenzhen, China; 2008, p. 383–386.
- 14. Tsanas A, Little MA, McSharry PE, Spielman J, Ramig LO. Novel speech signal processing algorithms for high-accuracy classification of Parkinson’s disease. IEEE Trans Biomed Eng. 2012;59(5):1264–71.
- 17. Takens F. Detecting strange attractors in turbulence. Lecture notes in mathematics, vol. 898. New York: Springer; 1981. p. 366–81.
- 20. Kantz H, Schreiber T. Nonlinear time series analysis. 2nd ed. Cambridge: Cambridge University Press; 1997.
- 24. Hurst HE, Black RP, Simaika YM. Long-term storage: an experimental study. London: Constable; 1965.
- 27. Kienast M, Sendlmeier WF. Acoustical analysis of spectral and temporal changes in emotional speech. In: Proceedings of the ISCA workshop on speech and emotion. Newcastle, UK; 5–7 September 2000, p. 92–97.