Machine Learning Approach for Emotional Speech Classification

  • Mihir Narayan Mohanty
  • Aurobinda Routray
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8947)


Recognizing emotion from speech remains a challenging research problem. This work proposes Singular Value Decomposition (SVD) as a reduced-dimension method for feature extraction. The novelty of the work is the combination of SVD features with a Support Vector Machine (SVM) classifier, which gives excellent results. The proposed features are evaluated on the emotion classification task through simulation, with the SVM designed to classify unseen emotional speech. The classifier with these features substantially outperforms the compared methods, reaching an accuracy of approximately 90%, which is a step towards automatic emotion recognition.
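The pipeline described above (SVD-based features fed to an SVM) can be illustrated with a minimal sketch. The abstract does not specify the corpus, the matrix the SVD is applied to, the number of singular values kept, or the SVM kernel, so everything below is an assumption made for illustration: each utterance is represented by a hypothetical frames-by-coefficients matrix filled with synthetic noise, the feature vector is its `k` largest singular values, and the classifier is scikit-learn's `SVC` with an RBF kernel. It is not the authors' implementation and does not reproduce their results.

```python
import numpy as np
from sklearn.svm import SVC


def svd_features(frame_matrix, k=4):
    """Return the k largest singular values of a (frames x coefficients) matrix.

    Using singular values as a compact feature vector is an assumption here;
    the abstract does not say which SVD quantities the authors keep.
    """
    singular_values = np.linalg.svd(frame_matrix, compute_uv=False)  # descending order
    return singular_values[:k]


def make_synthetic_utterance(rng, scale, n_frames=20, n_coeffs=12):
    """Hypothetical stand-in for per-frame speech features (pure noise, not real speech)."""
    return scale * rng.standard_normal((n_frames, n_coeffs))


rng = np.random.default_rng(0)

# Two synthetic "emotion" classes that differ only in signal energy (assumption).
class_scales = {0: 1.0, 1: 3.0}
features, labels = [], []
for label, scale in class_scales.items():
    for _ in range(60):
        features.append(svd_features(make_synthetic_utterance(rng, scale)))
        labels.append(label)
X = np.array(features)
y = np.array(labels)

# Hold out a quarter of the utterances as "unseen" data.
order = rng.permutation(len(y))
train_idx, test_idx = order[:90], order[90:]

classifier = SVC(kernel="rbf", C=1.0, gamma="scale")
classifier.fit(X[train_idx], y[train_idx])
accuracy = classifier.score(X[test_idx], y[test_idx])
print(f"feature matrix shape: {X.shape}, held-out accuracy: {accuracy:.2f}")
```

The synthetic classes are deliberately easy to separate, so the accuracy printed here says nothing about real emotional speech; the sketch only shows how a 20 x 12 matrix per utterance collapses to four numbers before classification.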


Emotional speech, Feature extraction, SVD, Classification, SVM



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. ITER, Siksha ‘O’ Anusandhan University, Bhubaneswar, India
  2. Department of Electrical Engineering, IIT Kharagpur, India
