Abstract
To improve the performance of speech emotion recognition systems and to reduce the associated computational complexity, this work proposes two approaches to spectral coefficient optimization: (1) optimization based on discrete spectral features and (2) optimization based on combined spectral features. Experiments were performed on the Berlin Emotional Database using a support vector machine (SVM) classifier and five spectral features: MFCC, LPC, LPCC, PLP and RASTA-PLP. The results show that speech emotion recognition based on optimized coefficient numbers can effectively improve performance. Accuracy improved significantly, by 2 % for the first approach and by 4 % for the second approach, compared with existing approaches. Moreover, the second approach outperformed the first in accuracy. These accuracy gains were achieved while reducing the number of features.
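The idea of optimizing the number of coefficients can be illustrated with a minimal sketch. It is not the chapter's procedure, which the abstract does not spell out. It assumes that "optimization" means sweeping the number of retained coefficients and keeping the count with the best hold-out accuracy. The data are synthetic stand-ins for per-utterance spectral feature vectors (not the Berlin Emotional Database), and a nearest-centroid classifier replaces the SVM so that the example needs only the Python standard library. All function names are hypothetical.

```python
import random


def make_synthetic(n_per_class=60, n_coeffs=20, n_informative=6, seed=0):
    """Synthetic stand-in for spectral feature vectors (assumption, not real data).

    Only the first n_informative coefficients differ between the two classes;
    the remaining coefficients are pure noise.
    """
    rng = random.Random(seed)
    data = []
    for label in (0, 1):
        for _ in range(n_per_class):
            vec = [rng.gauss(1.5 * label if j < n_informative else 0.0, 1.0)
                   for j in range(n_coeffs)]
            data.append((vec, label))
    rng.shuffle(data)
    return data


def centroid_accuracy(train, test, k):
    """Hold-out accuracy of a nearest-centroid classifier on the first k coefficients.

    Nearest-centroid is a stand-in for the SVM named in the abstract.
    """
    centroids = {}
    for label in {lab for _, lab in train}:
        rows = [vec[:k] for vec, lab in train if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    correct = 0
    for vec, lab in test:
        pred = min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(vec[:k], centroids[c])),
        )
        correct += pred == lab
    return correct / len(test)


def sweep_coefficient_count(data, max_k, train_fraction=0.7):
    """Try every coefficient count from 1 to max_k and return the best one.

    Ties are broken in favour of fewer coefficients, which keeps the
    feature vector as small as possible.
    """
    split = int(len(data) * train_fraction)
    train, test = data[:split], data[split:]
    scores = {k: centroid_accuracy(train, test, k) for k in range(1, max_k + 1)}
    best_k = max(scores, key=lambda k: (scores[k], -k))
    return best_k, scores


data = make_synthetic()
best_k, scores = sweep_coefficient_count(data, max_k=20)
print("best coefficient count:", best_k, "accuracy:", scores[best_k])
```

With real data, each vector would come from a spectral front end (for example MFCC extraction), and a single hold-out split would normally be replaced by cross-validation so that the chosen count is not tuned to one test partition.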
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Idris, I., Salam, M.S. (2016). Improved Speech Emotion Classification from Spectral Coefficient Optimization. In: Soh, P., Woo, W., Sulaiman, H., Othman, M., Saat, M. (eds) Advances in Machine Learning and Signal Processing. Lecture Notes in Electrical Engineering, vol 387. Springer, Cham. https://doi.org/10.1007/978-3-319-32213-1_22
DOI: https://doi.org/10.1007/978-3-319-32213-1_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-32212-4
Online ISBN: 978-3-319-32213-1
eBook Packages: Engineering, Engineering (R0)