Improved Speech Emotion Classification from Spectral Coefficient Optimization

Idris, Inshirah; Salam, Md Sah

doi:10.1007/978-3-319-32213-1_22


Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 387)


Abstract

To improve the performance of speech emotion recognition systems and to reduce the associated computational complexity, this work proposes two approaches to spectral coefficient optimization: (1) optimization based on discrete spectral features and (2) optimization based on combined spectral features. Experiments were performed on the Berlin Emotional Database using a support vector machine (SVM) classifier and five spectral features: MFCC, LPC, LPCC, PLP and RASTA-PLP. The results show that speech emotion recognition based on optimized coefficient numbers can effectively improve performance. Accuracy improved significantly, by 2 % for the first approach and 4 % for the second, compared with existing approaches. Moreover, the second approach outperformed the first in accuracy. These gains were achieved with a reduced number of features.
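The abstract does not say how the coefficient numbers were optimized, so the following is only a minimal sketch of one plausible reading: try several candidate coefficient counts, score each with cross-validation, and keep the best-scoring count, preferring fewer coefficients on ties. Everything in it is an assumption made for illustration. The data is synthetic rather than drawn from the Berlin Emotional Database, a nearest-centroid classifier stands in for the SVM so that the example needs only the Python standard library, and the names make_synthetic_data, nearest_centroid_accuracy and best_coefficient_count are hypothetical.

```python
import random
from statistics import mean


def make_synthetic_data(n_per_class=30, n_coeffs=20, n_informative=8, seed=0):
    """Toy stand-in for per-utterance spectral coefficient vectors.

    Assumption: only the first n_informative coefficients separate the
    three classes; the remaining ones are pure noise.
    """
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1, 2):
        for _ in range(n_per_class):
            row = [
                rng.gauss(label * 1.5 if j < n_informative else 0.0, 1.0)
                for j in range(n_coeffs)
            ]
            X.append(row)
            y.append(label)
    return X, y


def nearest_centroid_accuracy(X, y, folds=5):
    """k-fold cross-validated accuracy of a nearest-centroid classifier.

    This classifier is a stand-in for the SVM named in the abstract.
    """
    idx = list(range(len(X)))
    random.Random(1).shuffle(idx)
    correct = 0
    for f in range(folds):
        test = idx[f::folds]
        test_set = set(test)
        train = [i for i in idx if i not in test_set]
        # One centroid per class, computed column by column on the training fold.
        centroids = {}
        for label in set(y[i] for i in train):
            rows = [X[i] for i in train if y[i] == label]
            centroids[label] = [mean(col) for col in zip(*rows)]
        for i in test:
            pred = min(
                centroids,
                key=lambda c: sum((a - b) ** 2 for a, b in zip(X[i], centroids[c])),
            )
            correct += pred == y[i]
    return correct / len(X)


def best_coefficient_count(X, y, candidates):
    """Score each candidate count using only the first k coefficients."""
    scores = {
        k: nearest_centroid_accuracy([row[:k] for row in X], y) for k in candidates
    }
    # Highest accuracy wins; on ties, prefer the smaller coefficient count.
    best = max(scores, key=lambda k: (scores[k], -k))
    return best, scores


if __name__ == "__main__":
    X, y = make_synthetic_data()
    best, scores = best_coefficient_count(X, y, [2, 4, 8, 12, 20])
    for k, acc in scores.items():
        print(f"{k:2d} coefficients: accuracy {acc:.3f}")
    print("selected coefficient count:", best)
```

Under the same assumptions, the combined-feature variant could be sketched by concatenating the truncated vectors of several feature types before scoring. With real data, the synthetic rows would be replaced by coefficients extracted from speech and the stand-in classifier by an SVM.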



Author information

Authors and Affiliations

Computer Science Department, Sudan University of Science and Technology, Khartoum, Sudan
Inshirah Idris
Software Engineering Department, Universiti Teknologi Malaysia (UTM), Skudai, Johor, Malaysia
Md Sah Salam


Corresponding author

Correspondence to Inshirah Idris.

Editor information

Editors and Affiliations

Universiti Teknikal Malaysia Melaka, Durian Tunggal, Melaka, Malaysia
Ping Jack Soh
Singapore Campus, #05-01 SIT Building, Newcastle University, Singapore, Singapore
Wai Lok Woo
Universiti Teknikal Malaysia Melaka, Melaka, Malaysia
Hamzah Asyrani Sulaiman
Universiti Teknikal Malaysia Melaka, Melaka, Malaysia
Mohd Azlishah Othman
Universiti Teknikal Malaysia Melaka, Durian Tunggal, Melaka, Malaysia
Mohd Shakir Saat

About this paper

Cite this paper

Idris, I., Salam, M.S. (2016). Improved Speech Emotion Classification from Spectral Coefficient Optimization. In: Soh, P., Woo, W., Sulaiman, H., Othman, M., Saat, M. (eds) Advances in Machine Learning and Signal Processing. Lecture Notes in Electrical Engineering, vol 387. Springer, Cham. https://doi.org/10.1007/978-3-319-32213-1_22


DOI: https://doi.org/10.1007/978-3-319-32213-1_22
Published: 19 June 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-32212-4
Online ISBN: 978-3-319-32213-1
eBook Packages: Engineering, Engineering (R0)
