Cognitive Computation, Volume 5, Issue 4, pp 517–525

Global Selection of Features for Nonlinear Dynamics Characterization of Emotional Speech

  • Patricia Henríquez Rodríguez
  • Jesús B. Alonso Hernández
  • Miguel A. Ferrer Ballester
  • Carlos M. Travieso González
  • Juan R. Orozco-Arroyave

Abstract

This paper proposes the use of measures based on nonlinear dynamics to characterize emotional speech. Six measures are extracted from the recordings of an emotional speech database: mutual information, correlation dimension, correlation entropy, Shannon's entropy, Lempel–Ziv complexity and the Hurst exponent. Summary statistics (mean, standard deviation, skewness and kurtosis) are then computed over the extracted measures. Experiments were conducted on the Berlin emotional speech database for a three-class problem, with neutral, fear and anger as the emotional states. Feature selection is performed, and a methodology is proposed to find the best features. A neural network classifier is used to evaluate the discrimination ability of the selected features. The global success rate is 93.78 ± 3.18 %.
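The extraction step described in the abstract can be sketched as follows for a single measure, Shannon's entropy. This is a minimal illustration and not the authors' implementation: the frame-based analysis, the frame length, the hop size, the 16-bin amplitude quantization and all function names are assumptions introduced here. It uses only the Python standard library.

```python
import math
from collections import Counter


def shannon_entropy(frame, n_bins=16):
    """Shannon entropy in bits of one frame after uniform amplitude quantization.

    The quantization into n_bins levels is an assumption of this sketch.
    """
    lo, hi = min(frame), max(frame)
    if hi == lo:
        return 0.0  # a constant frame has a single symbol, so zero entropy
    width = (hi - lo) / n_bins
    # Map each sample to a bin index in [0, n_bins - 1].
    bins = [min(int((x - lo) / width), n_bins - 1) for x in frame]
    n = len(frame)
    return -sum((c / n) * math.log2(c / n) for c in Counter(bins).values())


def summary_statistics(values):
    """Return (mean, standard deviation, skewness, kurtosis) of a sequence.

    Population (biased) moment estimators are used; kurtosis is not
    excess kurtosis, so a Gaussian would give a value near 3.
    """
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0.0:
        # Degenerate case: higher moments are undefined, return zeros by convention.
        return mean, 0.0, 0.0, 0.0
    skew = sum((v - mean) ** 3 for v in values) / (n * std ** 3)
    kurt = sum((v - mean) ** 4 for v in values) / (n * std ** 4)
    return mean, std, skew, kurt


def utterance_features(signal, frame_len=256, hop=128):
    """Compute the per-frame entropy, then summarize it over the utterance.

    frame_len and hop are hypothetical values chosen for illustration.
    """
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, hop)]
    entropies = [shannon_entropy(f) for f in frames]
    return summary_statistics(entropies)
```

Under these assumptions, each utterance yields four features per measure. Repeating the same summary over the other five measures would give the feature pool from which a selection procedure picks a subset.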

Keywords

Nonlinear dynamic, Emotional speech, Forward floating feature selection, Neural networks

Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  • Patricia Henríquez Rodríguez (1)
  • Jesús B. Alonso Hernández (1)
  • Miguel A. Ferrer Ballester (1)
  • Carlos M. Travieso González (1)
  • Juan R. Orozco-Arroyave (2)

  1. Instituto Universitario para el Desarrollo Tecnológico e Innovación en Comunicaciones (IDeTIC), Universidad de Las Palmas de Gran Canaria, Las Palmas de Gran Canaria, Spain
  2. Departamento de Ingeniería Electrónica, Universidad de Antioquia, GEPAR and GITA Research Groups, Medellín, Colombia
