Speech Emotions Recognition Using 2-D Neural Classifier

  • Pavol Partila
  • Miroslav Voznak
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 210)


This article deals with a speech emotion recognition system. We discuss the use of a neural network as the final classifier of the emotional state of human speech. We carried out our research on a database of recordings of both genders in various emotional states. In the preprocessing and speech-processing phase, we focused on parameters that depend on the emotional state. The output of this work is a system for classifying the emotional state of a human voice, based on a neural network classifier. A self-organizing feature map, a specific type of artificial neural network, was used as the output-stage classifier. The number of input parameters must be limited because computing the neuron positions is demanding in both hardware and time. We therefore discuss the accuracy of the classifier when its input is the fundamental frequency calculated by different methods.
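As an illustration of the kind of feature extraction the abstract refers to, the following is a minimal sketch of fundamental-frequency estimation by central clipping followed by autocorrelation (the approach of references [2] and [4]). It is not the authors' implementation; the frame length, clipping ratio, and pitch search range are illustrative assumptions.

```python
import numpy as np

def estimate_f0(frame, fs, fmin=50.0, fmax=400.0, clip_ratio=0.3):
    """Estimate F0 of a voiced frame via central clipping + autocorrelation.

    clip_ratio, fmin, fmax are illustrative defaults, not values from the paper.
    """
    # Central clipping: zero samples below the threshold, keep only the excess.
    # This suppresses formant structure so the pitch peak dominates.
    threshold = clip_ratio * np.max(np.abs(frame))
    clipped = np.where(frame > threshold, frame - threshold,
                       np.where(frame < -threshold, frame + threshold, 0.0))

    # Autocorrelation, positive lags only (index 0 = zero lag).
    ac = np.correlate(clipped, clipped, mode="full")[len(clipped) - 1:]

    # Search for the strongest peak inside the plausible pitch-lag range.
    lag_min = int(fs / fmax)
    lag_max = int(fs / fmin)
    lag = lag_min + int(np.argmax(ac[lag_min:lag_max]))
    return fs / lag

fs = 8000
t = np.arange(0, 0.04, 1.0 / fs)           # one 40 ms frame
frame = np.sin(2 * np.pi * 150.0 * t)      # synthetic 150 Hz "voiced" frame
f0 = estimate_f0(frame, fs)
```

The F0 value obtained per frame would then be one of the inputs fed to the self-organizing map classifier described in the paper.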


Keywords: Fundamental frequency, Digital speech processing, Emotions, Neural network, Auto-correlation




References

  1. Nicholson, J., Takahashi, K., Nakatsu, R.: Emotion Recognition in Speech Using Neural Networks. Neural Computing & Applications 9(4), 290–296 (2006)
  2. Partila, P., Voznak, M., Mikulec, M., Zdralek, J.: Fundamental Frequency Extraction Method using Central Clipping and its Importance for the Classification of Emotional State. Advances in Electrical and Electronic Engineering 10(4), 270–275 (2012)
  3. Psutka, J., Muller, L., Smidl, L.: Feature space reduction and decorrelation in a large number of speech recognition experiments. In: Proc. 9th IASTED International Conference on Signal and Image Processing, SIP 2007, Honolulu, pp. 158–161 (2007)
  4. Rabiner, L.: On the use of autocorrelation analysis for pitch detection. IEEE Transactions on Acoustics, Speech and Signal Processing 25(1), 24–33 (1977)
  5. Kasi, K., Zahorian, S.A.: Yet Another Algorithm for Pitch Tracking. In: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), vol. 1, pp. I-361–I-364. IEEE (2002)
  6. Sun, X.: Pitch Determination and Voice Quality Analysis Using Subharmonic-To-Harmonic Ratio. In: Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 333–336. IEEE (2002)
  7. Gerhard, D.: Pitch Extraction and Fundamental Frequency: History and Current Techniques. University of Regina, Regina (2003)
  8. Solé-Casals, J., Martí-Puig, P., Reig-Bolaño, R., Zaiats, V.: Score Function for Voice Activity Detection. In: Solé-Casals, J., Zaiats, V. (eds.) NOLISP 2009. LNCS, vol. 5933, pp. 76–83. Springer, Heidelberg (2010)
  9. Picone, J.W.: Signal modeling techniques in speech recognition. Proceedings of the IEEE 81(9), 1215–1247 (1993)
  10. Beale, M., Howard, B., Hudson, M.: Neural Network Design. Campus Publ. Service, Boulder (2002)
  11. Roussinov, D., Chen, H.: A Scalable Self-organizing Map Algorithm for Textual Classification: A Neural Network Approach to Thesaurus Generation. Communication Cognition and Artificial Intelligence (1998)
  12. Snášel, V., Húsek, D., Frolov, A.A., Řezanková, H., Moravec, P., Polyakov, P.: Bars Problem Solving – New Neural Network Method and Comparison. In: Gelbukh, A., Kuri Morales, Á.F. (eds.) MICAI 2007. LNCS (LNAI), vol. 4827, pp. 671–682. Springer, Heidelberg (2007)

Copyright information

© Springer International Publishing Switzerland 2013

Authors and Affiliations

  1. Dpt. of Telecommunications, VSB-Technical University of Ostrava, Ostrava-Poruba, Czech Republic
