Multi-algorithm Fusion for Speech Emotion Recognition

  • Gyanendra K. Verma
  • U. S. Tiwary
  • Shaishav Agrawal
Part of the Communications in Computer and Information Science book series (CCIS, volume 192)

Abstract

In this paper, we propose a speech emotion recognition system based on multi-algorithm fusion. Mel Frequency Cepstral Coefficients (MFCC) and the Discrete Wavelet Transform (DWT), two prominent algorithms for speech analysis, are used to extract emotion information from the speech signal. MFCCs, which represent the short-term power spectrum of a sound, are a classical approach to speech analysis, while the DWT is a multiresolution approach that captures frequency information along with time information. Feature-level fusion of the two algorithms is performed after features are extracted by acoustic analysis of the emotional speech signal. The final emotion state is determined by classification with a Support Vector Machine. The popular Berlin emotion database is used to evaluate the proposed system. The results are very promising, as the proposed fusion algorithm outperformed the individual algorithms.
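The pipeline the abstract describes (extract two feature sets from the same speech signal, concatenate them at the feature level, then classify) can be sketched in a few lines. The sketch below is a dependency-free illustration, not the paper's implementation: `mfcc_like` uses uniform frequency bands in place of a true mel filter bank, the DWT is a hand-rolled Haar decomposition summarized by per-band energies, and a nearest-centroid rule stands in for the paper's SVM classifier. All function names and parameters here are illustrative assumptions.

```python
import numpy as np

def haar_dwt(signal, levels=3):
    """Haar DWT summarized as log band energies: one per detail level plus the final approximation."""
    feats = []
    a = np.asarray(signal, dtype=float)
    for _ in range(levels):
        if len(a) % 2:                                 # pad to even length before pairing samples
            a = np.append(a, 0.0)
        approx = (a[0::2] + a[1::2]) / np.sqrt(2)
        detail = (a[0::2] - a[1::2]) / np.sqrt(2)
        feats.append(np.log1p(np.mean(detail ** 2)))   # detail-band energy at this scale
        a = approx
    feats.append(np.log1p(np.mean(a ** 2)))            # energy of the final approximation band
    return np.array(feats)

def mfcc_like(signal, n_coeffs=8):
    """Crude MFCC-style features: type-II DCT of log band energies (uniform bands, no mel warping)."""
    spectrum = np.abs(np.fft.rfft(signal)) ** 2
    bands = np.array_split(spectrum, 16)               # 16 coarse frequency bands
    log_e = np.log1p(np.array([b.mean() for b in bands]))
    n = len(log_e)
    k = np.arange(n)
    return np.array([np.sum(log_e * np.cos(np.pi * i * (2 * k + 1) / (2 * n)))
                     for i in range(n_coeffs)])

def fused_features(signal):
    """Feature-level fusion: concatenate the two feature vectors into one."""
    return np.concatenate([mfcc_like(signal), haar_dwt(signal)])

# Toy demonstration: two synthetic "emotion" classes that differ in dominant frequency.
rng = np.random.default_rng(0)

def make(freq, n=1024):
    t = np.linspace(0, 1, n, endpoint=False)
    return np.sin(2 * np.pi * freq * t) + 0.1 * rng.standard_normal(n)

train_a = [fused_features(make(40)) for _ in range(5)]
train_b = [fused_features(make(200)) for _ in range(5)]
cent_a, cent_b = np.mean(train_a, axis=0), np.mean(train_b, axis=0)

# Nearest-centroid decision (stand-in for the SVM used in the paper).
test_vec = fused_features(make(40))
pred = "A" if np.linalg.norm(test_vec - cent_a) < np.linalg.norm(test_vec - cent_b) else "B"
```

The key point the sketch shows is that feature-level fusion is just concatenation before classification: the classifier sees one combined vector (here 8 cepstral-style coefficients plus 4 wavelet band energies) rather than two separate decisions being merged afterwards.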

Keywords

Multi-algorithm Fusion · MFCC · DWT · Speech Emotion Recognition



Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Gyanendra K. Verma (1)
  • U. S. Tiwary (1)
  • Shaishav Agrawal (1)
  1. Indian Institute of Information Technology, Allahabad, Allahabad, India