Statistical Feature Selection for Mandarin Speech Emotion Recognition

  • Bo Xie
  • Ling Chen
  • Gen-Cai Chen
  • Chun Chen
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3644)


The performance of speech emotion recognition depends largely on the acoustic features fed to the classifier. This paper studies statistical feature selection for Mandarin speech emotion recognition, using a speaker-dependent emotional Mandarin database. Pitch, energy, duration, formant-related features, and some velocity information were chosen as base features. Statistics of these base features formed the original feature set, and full stepwise discriminant analysis (SDA) was used to select among the extracted features. The selected features were evaluated with an LDA-based classifier. Experimental results indicate that pitch, log energy, speed, and the first formant are the most important factors, and that the accuracy rate increases from 63.1% to 76.5% after feature selection. In addition, the features selected by SDA outperform those chosen by other feature selection methods, both with the LDA-based classifier and with an SVM. The best performance is achieved with 9 to 12 features.
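The selection step summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it shows only the forward half of stepwise discriminant analysis, adding at each step the feature that most reduces Wilks' lambda (the ratio of within-class to total scatter determinants), and it omits the F-to-enter and F-to-remove tests that a full SDA would apply. The function names (`scatter`, `det`, `wilks_lambda`, `forward_select`) and the synthetic four-feature data are assumptions made for illustration, not taken from the paper.

```python
import random

def scatter(rows, cols):
    """Scatter matrix (sum of outer products of centered rows) over the chosen columns."""
    n = len(rows)
    means = [sum(r[c] for r in rows) / n for c in cols]
    return [[sum((r[a] - ma) * (r[b] - mb) for r in rows)
             for b, mb in zip(cols, means)]
            for a, ma in zip(cols, means)]

def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    d = 1.0
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(m[r][i]))
        if abs(m[p][i]) < 1e-12:
            return 0.0
        if p != i:
            m[i], m[p] = m[p], m[i]
            d = -d
        d *= m[i][i]
        for r in range(i + 1, n):
            f = m[r][i] / m[i][i]
            for c in range(i, n):
                m[r][c] -= f * m[i][c]
    return d

def wilks_lambda(X, y, cols):
    """det(W) / det(T) on the chosen columns; smaller means better class separation."""
    total = scatter(X, cols)
    within = [[0.0] * len(cols) for _ in cols]
    for label in set(y):
        s = scatter([x for x, t in zip(X, y) if t == label], cols)
        within = [[w + v for w, v in zip(wr, sr)] for wr, sr in zip(within, s)]
    dt = det(total)
    return det(within) / dt if dt > 0 else 1.0

def forward_select(X, y, k):
    """Greedily add the feature that minimizes Wilks' lambda until k are chosen."""
    selected = []
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < k:
        best = min(remaining, key=lambda j: wilks_lambda(X, y, selected + [j]))
        selected.append(best)
        remaining.remove(best)
    return selected

# Synthetic data (an assumption): columns 0 and 2 carry class information,
# columns 1 and 3 are pure noise.
rng = random.Random(0)
class_means = {0: [0.0, 0.0, 0.0, 0.0],
               1: [3.0, 0.0, 4.0, 0.0],
               2: [6.0, 0.0, 0.0, 0.0]}
X, y = [], []
for label, means in class_means.items():
    for _ in range(60):
        X.append([rng.gauss(m, 1.0) for m in means])
        y.append(label)

selected = forward_select(X, y, 2)
print("selected feature indices:", selected)
```

Minimizing the joint lambda of the candidate set is equivalent to minimizing the partial lambda of the candidate feature, because the lambda of the already selected set is the same for every candidate. In practice the selected columns would then be passed to a classifier such as LDA and scored on held-out utterances.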


Keywords: Support Vector Machine; Feature Selection; Emotion Recognition; Feature Selection Method; Feature Number
These keywords were added by machine and not by the authors.





Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Bo Xie (1)
  • Ling Chen (1)
  • Gen-Cai Chen (1)
  • Chun Chen (1)
  1. College of Computer Science, Zhejiang University, Hangzhou, P.R. China
